Error generating epicov data #523

Closed
GopiGugan opened this issue Apr 28, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@GopiGugan
Contributor

[covizu@BEVi ~]$ tail batch.log
...
🏄 [12:20:25.736335] start BQ.1.1, 73245 entries
🏄 [12:27:05.707608] Parsing output files
Failed to retrieve metadata for accession EPI_ISL_18554387

    clusters = json.load(infile)
    for cluster in clusters:
        for variant, samples in cluster['nodes'].items():
            revised = []
            for coldate, accn, location, name in samples:
                md = metadata.get(accn, None)
                if md is None:
                    print("Failed to retrieve metadata for accession {}".format(accn))
                    sys.exit()
                revised.append([name, accn, location, coldate, md['gender'], md['age'], md['status']])
            # replace list of samples
            cluster['nodes'][variant] = revised
    return clusters

GopiGugan added the bug label Apr 28, 2024
@ArtPoon
Contributor

ArtPoon commented Apr 30, 2024

Can we grep for this accession number in that provision file?
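
For reference, a minimal Python sketch of that check, for cases where a plain grep is awkward (for example, an xz-compressed file). The function name `find_accession` and the assumption that the provision file is line-oriented text are illustrative, not taken from the covizu code base.

```python
import lzma
import re


def find_accession(path, accession):
    """Return the 1-based numbers of lines in `path` that mention `accession`.

    Assumes a line-oriented text file, optionally xz-compressed
    (hypothetical layout; adjust to the real provision format).
    """
    # Word boundaries stop EPI_ISL_18554387 from matching EPI_ISL_185543870.
    pattern = re.compile(r"\b" + re.escape(accession) + r"\b")
    opener = lzma.open if path.endswith(".xz") else open
    hits = []
    with opener(path, "rt", encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append(lineno)
    return hits
```

An empty list for the latest file and a non-empty list for an older one would suggest the record was dropped between the two feeds.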

@GopiGugan
Contributor Author

This accession number existed in a previous provision file, but not in the latest one. I am investigating how it ends up in the current clusters.json file.

@ArtPoon
Contributor

ArtPoon commented May 2, 2024

Sequences are sometimes retracted from the database, so a record that appeared in a previous provision file may no longer appear in subsequent files.
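
If retractions are expected, one option is to skip samples with no metadata rather than abort the run. The sketch below reworks the loop quoted at the top of this issue under that assumption. The function name `attach_metadata` and the returned `missing` list are hypothetical, and this is not necessarily the fix adopted in covizu.

```python
def attach_metadata(clusters, metadata):
    """Append gender/age/status to every sample that has metadata.

    Samples whose accession is absent from `metadata` (e.g. retracted
    records) are dropped and reported instead of stopping the run.
    Returns (clusters, missing_accessions).
    """
    missing = []
    for cluster in clusters:
        for variant, samples in cluster['nodes'].items():
            revised = []
            for coldate, accn, location, name in samples:
                md = metadata.get(accn)
                if md is None:
                    # Assumption: a missing record means the sequence was retracted.
                    missing.append(accn)
                    continue
                revised.append([name, accn, location, coldate,
                                md['gender'], md['age'], md['status']])
            # replace list of samples
            cluster['nodes'][variant] = revised
    return clusters, missing
```

Dropping samples this way could leave other parts of the cluster record (counts, edges, and so on) inconsistent, so the `missing` list should at least be logged for follow-up.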

@GopiGugan
Contributor Author

covizu/batch.py

Lines 320 to 338 in 22f4f4c

if args.use_db:
    # Insert all updated records into the database
    for record in result:
        cur.execute('''
            INSERT INTO CLUSTERS
            VALUES (%s, %s)
            ON CONFLICT (lineage) DO UPDATE
            SET cluster_data = %s
        ''', [record['lineage'], json.dumps(record), json.dumps(record)])
    # Retrieve cluster data for other lineages from the database
    for lineage, _ in by_lineage.items():
        if lineage not in updated_lineages:
            cur.execute("SELECT cluster_data FROM CLUSTERS WHERE lineage = '%s'"%lineage)
            cluster_info = cur.fetchone()
            if cluster_info is None:
                cb.callback("Missing CLUSTERS record for lineage {}".format(lineage), level='ERROR')
                sys.exit()
            result.append(cluster_info['cluster_data'])
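
One reading of this snippet is that cluster data for lineages that were not updated is reused from the database as-is, so an accession retracted since the last update could still be present in it. A rough way to audit for that is sketched below. The helper name `stale_accessions` is hypothetical, and the record layout (a `lineage` key plus a `nodes` mapping whose samples carry the accession at index 1) is an assumption based on the two snippets in this thread.

```python
def stale_accessions(records, current_accessions):
    """Map each lineage to the accessions in its cached cluster record
    that are absent from `current_accessions`.

    Assumed layout: record['lineage'] is a string and record['nodes'] maps
    a variant to a list of samples with the accession at index 1.
    """
    stale = {}
    for record in records:
        for samples in record['nodes'].values():
            for sample in samples:
                accn = sample[1]
                if accn not in current_accessions:
                    stale.setdefault(record['lineage'], set()).add(accn)
    return stale
```

Passing the set of accessions from the latest provision file as `current_accessions` would list, per lineage, which cached samples no longer have metadata.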
