NCBI Entrez eSearch RuntimeError: Invalid db name specified: nuccore #915
Comments
You've already tried This could be a temporary problem which might be fixed in a day or so, but if not I think you will need to email the NCBI Entrez help address - and then please update this GitHub issue. Thanks. |
Just a question: why are you looping with species through the list, and passing the same list each time as an argument? |
Good point Iddo - it looks like the script works due to a second error in the function definition, and Python's overly helpful scope rules. It should be:
def ncbi_search(species):
    # Do stuff
    ...

for species in species_list:
    ncbi_search(species) |
This line: will have problems. |
I wonder why |
Good point Iddo. I will make that change, species_list to species. I just tried the script again and am getting the same error with both On Tue, Aug 23, 2016 at 7:55 AM, Sean notifications@github.com wrote:
|
Well, it won't run now "as is" simply because the variables have no values. |
Here is the code. Right now it will search a file I titled "Fish list.txt" that contains a list of common names and fish Genus species. It will take each Genus species and search NCBI for 16S and '"bony fishes"[porgn:__txid7898]' and will make a list of all the IDs found. It will then go through that list creating a new list. If the ID isn't in the new list it will download the .fasta. I couldn't figure out how to attach the fish list.txt file, so there is a list at the bottom. It hasn't failed on the download portion yet using efetch. It just fails on the esearch portion. There is a while True: in there to keep the script running when it fails, but an ERROR message will pop up. Let me know if you have any ideas. Thanks Damian
Green sturgeon_, Acipenser medirostris |
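For readers following along, the list handling described above might be sketched like this. It is only an illustration, not the script from this thread: the helper names parse_species and new_ids are hypothetical, and it assumes each line of the fish list has the form "common name, Genus species".

```python
def parse_species(line):
    """Return the 'Genus species' part of a 'common name, Genus species' line."""
    # Assumption: the scientific name is whatever follows the last comma.
    return line.rsplit(",", 1)[-1].strip()


def new_ids(found_ids, already_seen):
    """Return the IDs in found_ids that are not yet in already_seen.

    already_seen is a set that is updated in place, so an ID returned by
    several searches is only downloaded once.
    """
    fresh = []
    for seq_id in found_ids:
        if seq_id not in already_seen:
            already_seen.add(seq_id)
            fresh.append(seq_id)
    return fresh
```

Keeping the seen IDs in a set rather than a second list keeps the membership check fast even when several hundred species each return many IDs.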
P.S. I've e-mailed NCBI and am waiting on a reply |
Are you sure you have enough fishies there? ;) The following worked. I used only the Linnean genus / species epithets, and I wrapped it in quotes (otherwise they become separate terms).
|
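A minimal sketch of the quoting advice above, assuming the three pieces of the query (species, add_term, cat, as named in the original post) are joined with AND. The helper name build_term is hypothetical and the exact term format used in the original script is not shown in this thread.

```python
def build_term(species, add_term, cat):
    """Build an Entrez search term with the species name quoted as one phrase."""
    # Wrap the binomial in double quotes so "Genus species" is searched as a
    # single phrase instead of two separate terms. Strip any quotes that are
    # already present to avoid doubling them.
    quoted = '"%s"' % species.strip().strip('"')
    # Assumption: the three parts are combined with the AND operator.
    return " AND ".join([quoted, add_term, cat])
```

For example, build_term("Acipenser medirostris", "16S", '"bony fishes"[porgn:__txid7898]') gives '"Acipenser medirostris" AND 16S AND "bony fishes"[porgn:__txid7898]'.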
Thanks for the info on putting the Genus species names in quotes. That seemed to help. Unfortunately, it didn't completely fix it. My list is around 500 fish long, and it started throwing the same errors at fish number 323. I may just have to split up my fish list and do multiple runs. Thanks Iddo |
Some fish simply give nothing. Like Salmo pacificus, which has no 16S it Iddo Friedberg On Aug 23, 2016 5:53 PM, "Damian" notifications@github.com wrote:
|
Not sure if you ever heard back from NCBI on this, but I'm guessing it may be a general issue with Entrez. I'm scripting access to the 'gds' database (Geo Data Sets). While it works most of the time, in about 1 out of 20 attempts I get an "Invalid db name specified: gds" error. I'm debating modifying my script to just check for that and retry, but I'd like to find a better solution. |
Even on the sequence database side of Entrez, for non-trivial usage you will need retries - this is life with a networked resource. |
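One way the retry idea could look, as a rough sketch rather than anything built into Biopython: with_retries and search_ids are hypothetical helper names, the attempt count and delays are arbitrary, and only RuntimeError (the exception in the traceback from this issue) is retried by default. Network errors would need to be added to retry_on separately.

```python
import time


def with_retries(func, attempts=5, delay=1.0, retry_on=(RuntimeError,),
                 sleep=time.sleep):
    """Call func() until it succeeds or attempts run out.

    Waits delay * attempt_number seconds between tries and re-raises the
    last error if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                sleep(delay * attempt)
    raise last_error


def search_ids(term, db="nuccore"):
    """Run an esearch with retries and return the list of IDs found."""
    # Imported here so with_retries can be used without Biopython installed.
    # Entrez.email should be set by the caller before searching.
    from Bio import Entrez

    def attempt():
        handle = Entrez.esearch(db=db, term=term)
        try:
            return Entrez.read(handle)["IdList"]
        finally:
            handle.close()

    return with_retries(attempt)
```

Capping the number of attempts, instead of looping with while True:, means a genuinely bad query or a long outage still surfaces as an error rather than hanging the run.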
I have not heard back from NCBI yet. The address I e-mailed was I tried my script last Thursday 8/26 and it failed on every run. I tried On Sun, Aug 28, 2016 at 1:05 AM, Peter Cock notifications@github.com
|
I understand that servers might fail to respond, necessitating retries, but |
@peterjc I don't have a trace saved, but I remember the message being exactly the same as in this issue title: RuntimeError: Invalid db name specified: gds Which doesn't make sense coming from NCBI since gds definitely is one of the databases, and my queries work 95% of the time. Usually repeating the same esummary command exactly is enough to get a valid response. |
This is probably coming from https://github.com/biopython/biopython/blob/master/Bio/Entrez/Parser.py#L390 where It does sound like an intermittent problem with the NCBI GDS database not always responding via Entrez. |
Just tried again and am now getting a new error: Searching NCBI for Strongylocentrotus franciscanus and 12S and "sea urchins"[porgn:__txid7625]
|
@dmenning From the line numbering in the traceback for the exception you must have a very old release, probably Biopython 1.65? https://github.com/biopython/biopython/blob/biopython-165/Bio/Entrez/Parser.py#L513 The Our current release is Biopython 1.68, released last month. Could you update your copy of Biopython and re-test please? |
Done. It appears to be working now. I also got a reply from NCBI. It seems like they are having numerous issues. "Thank for reporting this issue. We have recently been experiencing issues with our E-Utilities services regarding db name; it is not unique to nuccore. We are currently investigating the issue to determine the root cause. In the meantime, we can only suggest that you continue to attempt your query several times, as the issue appears to be intermittent and random. As of now, we do not have a specific timeline for resolution of this issue." |
Thank you for passing on those details from the NCBI. Hopefully they can fix things on their side soon. Is there anything else you think we need to do on the Biopython side, or can we close this issue? Thanks! |
Everything appears to be functioning normally as before. I think this issue is closed. Thanks everyone for all of the help. Damian |
Hello everyone,
Two questions:
1. When NCBI phases out GI numbers, what will search_results["IdList"] return? Does the code need to be changed to get Accession.Version?
2. I am running a Python 2.7 script that has worked in the past but now is throwing an error. I am wondering if there has been a change recently that may be causing the problem.
My script searches NCBI using each Genus species name from a list (species), a general category like bony fish (cat), and a search term like 16S (add_term) and returns a list of ids. The script is below:
The error I am getting is:
Traceback (most recent call last):
  File "D:\Python27\0 eDNA\02_GenBank_get_fasta_or_gb_no_repeats_1_search_criteria.py", line 98, in <module>
    ncbi_search(species_list)
  File "D:\Python27\0 eDNA\02_GenBank_get_fasta_or_gb_no_repeats_1_search_criteria.py", line 49, in ncbi_search
    search_results = Entrez.read(search_handle)
  File "C:\Python27\lib\site-packages\Bio\Entrez\__init__.py", line 376, in read
    record = handler.read(handle)
  File "C:\Python27\lib\site-packages\Bio\Entrez\Parser.py", line 205, in read
    self.parser.ParseFile(handle)
  File "C:\Python27\lib\site-packages\Bio\Entrez\Parser.py", line 343, in endElementHandler
    raise RuntimeError(value)
RuntimeError: Invalid db name specified: nuccore
I have tried changing the db to "nuccore" and I get the same error. I put in the while True: to keep the script running after the error. Any ideas on what is going wrong?
Thanks for the help
Damian