2015Q03

Jul 16, 2015

Release 0.99.0.

Sep 25, 2015

Release 0.101.1.

2015-07-15 [amo]

Misc

  • [o] Make query bi=放射線を照射する放射線源と @ DEPATISnet not croak
  • [o] Datasource “SDP Data Proxy” I
    • “Siemens AND Bosch” vs. (Siemens AND Bosch) vs. Siemens AND Bosch
    • Query recorder works?
    • What about truncation and proximity search?
    • What about umlauts and others? e.g. Ständer, poussée
    • Make classes and date/time ranges work
      • Valid date expressions should be:
        • pd:[19800101 TO 19851231]
        • pd:[* TO 19601231]
        • pdyear:[1980 TO 1985]
        • pdyear:[* TO 1960]
    • Expert search: CQL field symbols for IFI; 1st iteration
    • Make crawler work
    • Measure worth of whole SDP addon using sloccount (through commit 12bd07c23ea4eb4b7ebb16f95faed5c30b9a3d5b) => $19.301,-
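
The date expressions listed above use a Solr-style range syntax. A minimal sketch of how such expressions could be generated; the helper name `range_expression` is hypothetical and not taken from the SDP addon:

```python
def range_expression(field, start=None, end=None):
    """Build a range expression like ``pd:[19800101 TO 19851231]``.

    ``None`` on either side yields an open bound, written as ``*``.
    """
    lower = "*" if start is None else str(start)
    upper = "*" if end is None else str(end)
    return "{0}:[{1} TO {2}]".format(field, lower, upper)


# Reproduce the four expressions from the list above
print(range_expression("pd", 19800101, 19851231))
print(range_expression("pd", end=19601231))
print(range_expression("pdyear", 1980, 1985))
print(range_expression("pdyear", end=1960))
```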

Datasource “SDP Data Proxy” II

  • [o] Times out every 10 minutes; deal with that proactively, like ftpro/search.py does
    1. SearchException: SDP: search failed. status_code=403 FORBIDDEN, content=timeout
    2. 2015-07-16 16:41:31,435 ERROR [elmyra.ip.access.sdp.dataproxy][MainThread] SDP login failed. status_code=500, content=<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>500 Internal Server Error</title> <h1>Internal Server Error</h1> <p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>
  • [o] Display raw results from upstream
  • [o] Expert search: CQL field symbols for IFI
    • [o] Put examples into field knowledgebase
    • [o] Add more/all fields into chooser
    • [o] Put fields from “IFI Integrated Data” into chooser
  • [o] More fine-grained searching in fulltext (title vs. abstract vs. claims vs. description) using modifiers like FulltextPRO
  • [o] Refine number normalization specific to SDP/IFI
  • [o] SdpExpression: parse expressions in a more sophisticated way to make things like “EP666666 or EP666667” or
    “?query=pn%3AEP666666&datasource=sdp” possible
  • [o] SdpExpression: Make it grok the Solr query to support highlighting in expert mode
  • [o] Delivers e.g. IN2009KN02715A, IN2011MU02282A, IN2011MU02281A, IN2011MU02280A which are not available from OPS.
  • [o] Nothing to see for pn=KR20150081632A or pn=KR20150081562A or pn=KR20150082258A or pn=KR20150082290A or pn=KR20150082307A or pn=KR20150082340A or pn=KR20150082189A or pn=KR20150082484A or pn=KR20150082605A or pn=KR20150082609A
    when querying for “GB or DE”
  • [o] Write tests for expression parser
  • [o] Make result sorting controllable by GUI; e.g. by date
  • [o] Hotkeys for switching to datasource SDP; add to https://patentsearch-develop.elmyra.de/help
  • [o] Crawling SDP might deliver duplicates when there are “many” results.
    Proof:
    https://patentsearch-staging.elmyra.de/?datasource=sdp&query=pa%3A(Siemens+AND+Bosch)
    Total: 45,383
    
    $ cat testlist.txt | wc -l
    49999
    
    $ cat testlist.txt | sort | uniq | wc -l
    34992
    
    Fix this either by crawling more pages upfront or by adjusting adaptively: count the distinct result numbers and continue crawling until total_count is reached or no more new numbers are added to the set/pool.
  • [o] Implement more specs from “CLAIMSDirect SOLR v2-0-part1-Sept2014.pdf”
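
For the duplicate problem when crawling SDP noted above, the adaptive variant could look roughly like the following sketch. `fetch_page` is a hypothetical callable standing in for the actual upstream request, and the parameter names and defaults are assumptions, not taken from the existing crawler:

```python
def crawl_distinct(fetch_page, total_count, page_size=250, max_stale_pages=3):
    """Collect distinct document numbers until ``total_count`` is reached
    or several consecutive pages contribute nothing new.

    ``fetch_page(offset, limit)`` is a hypothetical callable returning a
    list of document numbers; an empty list signals the end of the results.
    """
    seen = set()
    ordered = []
    stale_pages = 0
    offset = 0
    while len(seen) < total_count and stale_pages < max_stale_pages:
        page = fetch_page(offset, page_size)
        if not page:
            break
        before = len(seen)
        for number in page:
            if number not in seen:
                seen.add(number)
                ordered.append(number)
        # Count consecutive pages that added nothing; reset on progress
        stale_pages = stale_pages + 1 if len(seen) == before else 0
        offset += page_size
    return ordered
```

The `max_stale_pages` guard covers the second stop condition from the note: crawling ends once no more new numbers are being added, even if `total_count` was never reached.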

  • [o] Problem fetching description from Espacenet for e.g. a newer US document like “USPP25706P2”:
          return espacenet_description(patent)
        File "/Users/amo/dev/elmyra/elmyra.ip.access.epo/elmyra.ip.access.epo/elmyra/ip/access/epo/espacenet.py", line 26, in espacenet_description
          description = soup.find('div', {'id': 'description'}).find('p')
      AttributeError: 'NoneType' object has no attribute 'find'
    
  • [o] Why does fetching the claims/description of DE112013005111A5 go to espacenet.com?
  • [o] Query history: rename “cql” to “expert”
  • [o] Show “time required” in search status bar (take from metadata or generate on our own)
  • [o] Encapsulate UpstreamResults into unified subsystem
  • [o] Add more fields to comfort form (SDP is rich!)
  • [o] Visualization of keywords in text (histogram)
  • [o] Alternative display mode that puts more emphasis on the text.
    Instead of the image, there could be a scaled-down view of the text in which the keywords are also highlighted and equipped with jump marks.
  • [o] After submitting a query, one might want to see a histogram of the country distribution, or by other criteria, e.g. applicant or class.
  • [o] Fix /api/ops/US5467352A/pdf/all => Internal Server Error
  • [o] Chrome 43.0.2357.134 warns with:
    Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user’s experience. For more help, check http://xhr.spec.whatwg.org/.
  • [o] Progress indicator when crawling huge numberlist
  • [o] Query for “siemens” and 2000-2015
    2015-07-16 05:13:33,083 ERROR [elmyra.ip.util.numbers.common][Dummy-5] Could not parse patent number "IND256512S"
    
  • [o] Discrepancies between OPS and raw number formats re. “bosch” and 2000-2015 @ FulltextPRO: US2015184644A1 vs. US20150184644A1
  • [o] /help => rename “Open CQL field chooser”: strip “CQL”; also reword “Use a text area for submitting a CQL query expression”
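
The number format discrepancy noted above (US2015184644A1 vs. US20150184644A1) comes down to a leading zero in the serial part of US application publication numbers. A minimal sketch of a normalizer; the function name is hypothetical and this is not how `elmyra.ip.util.numbers` is known to handle it:

```python
import re

# US application publications: 4-digit year, then a 6- or 7-digit serial,
# optionally followed by a kind code like "A1".
US_PUBLICATION = re.compile(r"^US(20\d{2})(\d{6,7})([A-Z]\d?)?$")


def us_publication_to_short(number):
    """Convert e.g. US20150184644A1 to US2015184644A1.

    Numbers that do not look like US application publications
    (for example granted patents like US5467352A) are returned unchanged.
    """
    match = US_PUBLICATION.match(number)
    if not match:
        return number
    year, serial, kind = match.groups()
    # Drop the padding zero of a 7-digit serial
    if len(serial) == 7 and serial.startswith("0"):
        serial = serial[1:]
    return "US{0}{1}{2}".format(year, serial, kind or "")


print(us_publication_to_short("US20150184644A1"))
```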

2015-08-05 [amo]

17.91.60.141 - - [05/Aug/2015:09:47:56 +0200] "GET /api/ops/EP0731339/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.7"
217.91.60.141 - - [05/Aug/2015:09:47:56 +0200] "GET /api/ops/EP0570444/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.7"
217.91.60.141 - - [05/Aug/2015:09:47:57 +0200] "GET /api/ops/EP1425860/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.7"
217.91.60.141 - - [05/Aug/2015:09:48:32 +0200] "GET /api/ops/DE102015101697/kindcodes HTTP/1.1" 404 108 "-" "Python-urllib/2.7"
217.91.60.141 - - [05/Aug/2015:09:48:33 +0200] "GET /api/ops/DE102014014694/kindcodes HTTP/1.1" 404 108 "-" "Python-urllib/2.7"
217.91.60.141 - - [05/Aug/2015:09:48:33 +0200] "GET /api/ops/DE102014016037/kindcodes HTTP/1.1" 404 108 "-" "Python-urllib/2.7"
217.91.60.141 - - [05/Aug/2015:12:10:18 +0200] "GET /api/ops/WO2014044927/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.4"
217.91.60.141 - - [05/Aug/2015:12:10:19 +0200] "GET /api/ops/WO2014047466/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.4"
217.91.60.141 - - [05/Aug/2015:12:10:19 +0200] "GET /api/ops/WO2014046768/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.4"
217.91.60.141 - - [05/Aug/2015:12:11:47 +0200] "GET /api/ops/WO2014167377/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.4"
217.91.60.141 - - [05/Aug/2015:12:11:47 +0200] "GET /api/ops/WO2014072775/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.4"
217.91.60.141 - - [05/Aug/2015:12:12:38 +0200] "GET /api/ops/WO2014043816/kindcodes HTTP/1.1" 502 172 "-" "Python-urllib/2.4"
79.214.7.234 - - [05/Aug/2015:13:12:41 +0200] "GET /api/ops/publication/WO2014168261/family/inpadoc HTTP/1.1" 504 182 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.3;
79.214.7.234 - - [05/Aug/2015:13:12:48 +0200] "GET /api/ops/publication/WO2014168261/family/inpadoc HTTP/1.1" 504 182 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.3;
79.214.7.234 - - [05/Aug/2015:13:19:52 +0200] "GET /api/ops/publication/WO2014168261/family/inpadoc?constituents=biblio HTTP/1.1" 504 182 "https://patentsearch.elmyra.de/" "Mozilla/5.0
82.135.66.130 - - [05/Aug/2015:14:33:00 +0200] "GET /api/ops/DE102013225223A1/image/info HTTP/1.1" 504 182 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0)
80.152.196.150 - - [05/Aug/2015:14:41:43 +0200] "GET /api/drawing/US5339607A?page=2 HTTP/1.1" 504 182 "https://patentsearch.elmyra.de/?numberlist=DE4208818%2CDE1066478%2CFR2612487%2CDE4208818%2CEP0749902%2CEP0162933%2CUS5339607&datasource=ops" "Mozilla/5.0 (Windows NT 6.1; rv:39.0) Gecko/20100101 Firefox/39.0"
80.152.196.150 - - [05/Aug/2015:14:41:44 +0200] "GET /api/drawing/US5339607A?page=3 HTTP/1.1" 504 182 "https://patentsearch.elmyra.de/?numberlist=DE4208818%2CDE1066478%2CFR2612487%2CDE4208818%2CEP0749902%2CEP0162933%2CUS5339607&datasource=ops" "Mozilla/5.0 (Windows NT 6.1; rv:39.0) Gecko/20100101 Firefox/39.0"
84.153.161.142 - - [05/Aug/2015:14:41:46 +0200] "GET /api/drawing/US5339607A?page=4 HTTP/1.1" 504 182 "https://patentsearch.elmyra.de/?numberlist=DE4208818%2CDE1066478%2CFR2612487%2CDE4208818%2CEP0749902%2CEP0162933%2CUS5339607&datasource=ops" "Mozilla/5.0 (Windows NT 6.1; rv:39.0) Gecko/20100101 Firefox/39.0"

2015-08-06 [amo]

2015-08-07 [amo]


82.135.91.31   - - [03/Aug/2015:14:41:40 +0200] "GET /api/ops/published-data/search?query=txt%3D((rundum*+near%2C+3+*kamera*)+and+*bewegung*)&range=1-25 HTTP/1.1" 400 1155 "https://patentsearch.elmyra.de/?query=pn%3DDE112007001053B4&project=R21892K+Airblade+mit+horizontalen+K%C3%BChlluftklappen&datasource=ops" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.125 Safari/537.36"
80.152.196.150 - - [03/Aug/2015:15:36:04 +0200] "GET /api/ops/published-data/search?query=pn%3D%22US2013%2F0126991%22&range=1-10 HTTP/1.1" 401 576 "https://patentsearch.elmyra.de/?query=pn%3DWO2012079181A1&datasource=ops" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.125 Safari/537.36"
80.152.196.150 - - [03/Aug/2015:15:36:18 +0200] "GET /api/ops/published-data/search?query=pn%3D%22US+2013%2F0126991A1%22&range=1-10 HTTP/1.1" 401 579 "https://patentsearch.elmyra.de/?query=pn%3DWO2012079181A1&datasource=ops" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.125 Safari/537.36"
80.152.196.150 - - [03/Aug/2015:15:36:19 +0200] "GET /api/ops/published-data/search?query=pn%3D%22US+2013%2F0126991A1%22&range=1-10 HTTP/1.1" 401 579 "https://patentsearch.elmyra.de/?query=pn%3DWO2012079181A1&datasource=ops" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.125 Safari/537.36"
157.55.39.149  - - [02/Aug/2015:13:30:54 +0200] "GET /robots.txt HTTP/1.1" 302 160 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
79.214.0.47    - - [06/Aug/2015:11:12:02 +0200] "GET /api/drawing/US2015028183A1 HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/?query=pn%3DDE19829778A1&project=Traktion&datasource=ops" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
88.217.10.11   - - [06/Aug/2015:11:27:07 +0200] "GET /api/drawing/DE29714816U1 HTTP/1.1" 502 210 "https://patentsearch.elmyra.de/?query=ipc%3D%22E05D15%2F00%22&datasource=ops" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
88.217.10.11   - - [06/Aug/2015:13:13:45 +0200] "GET /api/pdf/EP2634455A4 HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
88.217.10.11   - - [06/Aug/2015:14:11:49 +0200] "GET /api/ops/DE102008045899A1/image/info HTTP/1.1" 502 176 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
82.135.91.31   - - [07/Aug/2015:09:21:07 +0200] "GET /api/ops/DE102013210503A1/image/info HTTP/1.1" 502 176 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.130 Safari/537.36"
88.217.10.11   - - [07/Aug/2015:10:53:44 +0200] "GET /api/pdf/EP2375104A4 HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
79.214.15.171  - - [07/Aug/2015:12:05:30 +0200] "GET /api/drawing/WO2009064264A1 HTTP/1.1" 502 213 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
62.216.198.234 - - [07/Aug/2015:17:03:00 +0200] "GET /api/depatisconnect/USRE30842E/claims HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
62.216.198.234 - - [07/Aug/2015:17:03:05 +0200] "GET /api/depatisconnect/USRE30842E/claims HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
62.216.198.234 - - [07/Aug/2015:17:03:07 +0200] "GET /api/depatisconnect/USRE30842E/claims HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
62.216.198.234 - - [07/Aug/2015:17:03:10 +0200] "GET /api/depatisconnect/USRE30842E/claims HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
62.216.198.234 - - [07/Aug/2015:17:03:13 +0200] "GET /api/depatisconnect/USRE30842E/claims HTTP/1.1" 502 172 "https://patentsearch.elmyra.de/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0"
2.207.208.89   - - [07/Aug/2015:19:57:03 +0200] "GET /api/depatisconnect/USRE30842E/claims HTTP/1.1" 502 574 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.134 Safari/537.36"
2.207.208.89   - - [07/Aug/2015:19:57:43 +0200] "GET /api/depatisconnect/USRE30842E/claims HTTP/1.1" 502 574 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.134 Safari/537.36"
  • [o] More HTTP status codes != 200; fix them:
    /api/pdf/EP2375104A4
    /api/pdf/EP2634455A4
    /api/depatisconnect/USRE30842E/claims
  • [o] Scrollbar indicating “where am I?”
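
To find such failing endpoints without reading the access log by eye, a small script can tally every request whose status is not 200. This is a sketch that assumes the combined log format seen in the excerpts above; the function name is made up for illustration:

```python
import re
from collections import Counter

# ip, two ignored fields, [timestamp], "METHOD path protocol", status
LOG_LINE = re.compile(
    r'^(?P<ip>\S+)\s+\S+\s+\S+\s+\[(?P<time>[^\]]+)\]\s+'
    r'"(?P<method>\S+)\s+(?P<path>\S+)\s+[^"]*"\s+(?P<status>\d{3})\s+'
)


def failed_requests(lines):
    """Count (status, path) pairs for all requests whose status is not 200.

    Query strings are stripped so that e.g. ``?page=2`` and ``?page=3``
    are counted under the same path.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") != "200":
            path = match.group("path").split("?", 1)[0]
            counts[(int(match.group("status")), path)] += 1
    return counts


sample = [
    '88.217.10.11   - - [06/Aug/2015:13:13:45 +0200] '
    '"GET /api/pdf/EP2634455A4 HTTP/1.1" 502 172 "-" "Mozilla/5.0"',
]
for (status, path), count in failed_requests(sample).most_common():
    print(status, path, count)
```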

2015-08-31 [amo]

2015-09-06 [amo]

2015-09-28 [gas]

Topic “US drawings”

Attached are a few US documents with a correction page as the first page. Searching with these is of course tedious because no drawings are visible.

US2010252183
US2008216963
US2008196825

Analysis

  • US2010252183 is listed as US2010252183A1
  • Certificate of Correction is delivered via “published-data/images/US/8052819/X6/fullimage”
  • Full-cycle member “US8052819B2” contains images via published-data/images/US/8052819/B2/fullimage.tiff?Range=3
    • Images are also available via published-data/images/US/20100252183/A1/fullimage.tiff?Range=3
      => Solution: fetch images from full-cycle member “B2”
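
The workaround above (take the drawings from the full-cycle member “B2” instead of the correction page) could be sketched as follows. The path layout is copied from the analysis; the function names and the preference order of kind codes are assumptions for illustration only:

```python
def image_path(country, number, kind, page):
    """Build the image resource path as observed in the analysis above."""
    return "published-data/images/{0}/{1}/{2}/fullimage.tiff?Range={3}".format(
        country, number, kind, page)


def pick_drawing_source(members, preferred_kinds=("B2", "B1", "A1")):
    """Pick the family member to fetch drawings from.

    ``members`` is a list of (country, number, kind) tuples. Kind codes not
    listed in ``preferred_kinds`` (such as the correction "X6") are skipped.
    Returns ``None`` if no member qualifies.
    """
    for kind in preferred_kinds:
        for member in members:
            if member[2] == kind:
                return member
    return None


members = [
    ("US", "8052819", "X6"),
    ("US", "20100252183", "A1"),
    ("US", "8052819", "B2"),
]
country, number, kind = pick_drawing_source(members)
print(image_path(country, number, kind, page=3))
```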

Patent numbers of family members as copyable text

It would also be nice if the members could be selected in the full family listing and copied into a text program.


Miscellaneous runtime errors

A DPMA error:

2015-09-28 15:10:29,449 ERROR [elmyra.ip.access.epo.services.dpma][MainThread] DEPATISnet search error: query="(pc=de or pc=ep) and (ic=b29c70/38?) and py >= 2000 and py <= 2015", reason=list index out of range, Exception was:
Traceback (most recent call last):
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/epo/services/dpma.py", line 61, in depatisnet_published_data_search_handler
    return dpma_published_data_search(query, request_size)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/cache.py", line 585, in cached
    return cache[0].get_value(cache_key, createfunc=go)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/cache.py", line 317, in get
    return self._get_value(key, **kw).get_value()
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/container.py", line 378, in get_value
    v = self.createfunc()
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/cache.py", line 583, in go
    return func(*args)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/epo/services/dpma.py", line 98, in dpma_published_data_search
    return depatisnet.search_patents(query, hits_per_page)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/dpma/depatisnet.py", line 123, in search_patents
    results = self.read_xls_response(xls_response)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/dpma/depatisnet.py", line 141, in read_xls_response
    data = excel_to_dict(xls_response.read())
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/dpma/depatisnet.py", line 174, in excel_to_dict
    if u'Search query' in sheet.cell(0, 0).value:
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/xlrd/sheet.py", line 399, in cell
    self._cell_types[rowx][colx],
IndexError: list index out of range
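
The traceback above ends in `sheet.cell(0, 0)`, presumably because the returned spreadsheet contained no cells at all. A defensive sketch that checks the sheet dimensions first; the helper names are hypothetical, and `sheet` only needs the `nrows`, `ncols` and `cell(row, col)` members that xlrd sheet objects provide:

```python
def first_cell_value(sheet, default=u""):
    """Return the value of cell (0, 0), or ``default`` for an empty sheet."""
    if sheet.nrows == 0 or sheet.ncols == 0:
        return default
    return sheet.cell(0, 0).value


def has_search_query_header(sheet):
    """Check for the 'Search query' marker without raising IndexError.

    Assumes the first cell, if present, holds text.
    """
    return u"Search query" in first_cell_value(sheet)
```

With such a guard, an empty response could be reported as “no results” or as a dedicated upstream error instead of surfacing as “list index out of range”.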

Another one:

2015-09-25 10:00:02,117 ERROR [elmyra.ip.access.epo.services.dpma][MainThread] DEPATISnet search error: query="((Bi=Greife? or Bi=Grip?) and (Bi=rohr or Bi=tube or Bi=circular)) and (pc=DE or pc=EP) and (IC=B26D? or IC=B23D?)", reason=HTTP Error 500: Internal Server Error, Exception was:
Traceback (most recent call last):
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/epo/services/dpma.py", line 61, in depatisnet_published_data_search_handler
    return dpma_published_data_search(query, request_size)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/cache.py", line 585, in cached
    return cache[0].get_value(cache_key, createfunc=go)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/cache.py", line 317, in get
    return self._get_value(key, **kw).get_value()
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/container.py", line 378, in get_value
    v = self.createfunc()
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/beaker/cache.py", line 583, in go
    return func(*args)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/epo/services/dpma.py", line 98, in dpma_published_data_search
    return depatisnet.search_patents(query, hits_per_page)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/elmyra/ip/access/dpma/depatisnet.py", line 122, in search_patents
    xls_response = self.browser.open(self.xlsurl)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/opt/elmyra/patentsearch/sites/prod/.venv27/local/lib/python2.7/site-packages/mechanize/_mechanize.py", line 255, in _mech_open
    raise response
httperror_seek_wrapper: HTTP Error 500: Internal Server Error

2015-09-29 [gas]

The query for B29C 70/38? for DE and EP within 2000,2015 returned 974 hits yesterday. Today there are 982! How can I find out which ones were added?
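
One way to answer this would be to keep the number list of the earlier run and compare it with the current one. A minimal sketch; the function name is hypothetical and the numbers below are illustrative placeholders, not actual results of that query:

```python
def new_hits(previous, current):
    """Return numbers present in ``current`` but not in ``previous``,
    keeping the order of ``current``."""
    known = set(previous)
    return [number for number in current if number not in known]


yesterday = ["EP0731339", "EP0570444"]
today = ["EP0731339", "EP1425860", "EP0570444"]
print(new_hits(yesterday, today))
```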