Could not subset
AtoanyFierro opened this issue · comments
I tried to run these lines:
## These lines come from the previous code
ann = dann.data
del phen
print(f'# total initial entities: {len(ann)}')

## Keep only disorders
for dis,val in list(ann.items()):
    if val['group'] != 'Disorder':
        del ann[dis]
print(f'# disases: {len(ann)}')

## Keep only those with phenotypic information
for dis,val in list(ann.items()):
    if not val.get('phenotype'):
        del ann[dis]
print(f'# disases with phenotype data: {len(ann)}')

## Remove clinical syndromes
for dis,val in list(ann.items()):
    if val['type'].lower() == 'clinical syndrome':
        del ann[dis]
print(f'# diseases w/o clinical syndromes: {len(ann)}')

## Keep only selected prevalences
valid_prev = ['>1 / 1000', '6-9 / 10 000', '1-5 / 10 000', '1-9 / 100 000', 'Unknown', 'Not yet documented']
for dis, val in list(ann.items()):
    if 'prevalence' in val:
        classes = [a['class'] for a in val['prevalence'] if a['type'] == 'Point prevalence']
        if not any(x in valid_prev for x in classes):
            del ann[dis]
    else:
        del ann[dis]
print(f'# disases with valid prevalence: {len(ann)}')
and I get this:
# total initial entities: 12082
KeyError Traceback (most recent call last)
in
      6 ## Keep only disorders
      7 for dis,val in list(ann.items()):
----> 8     if val['group'] != 'Disorder':
      9         del ann[dis]
     10 print(f'# disases: {len(ann)}')
KeyError: 'group'
How can I run these lines in order to perform a subset?
I am closing this, as #15 should solve it.
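For anyone hitting the same traceback: the `KeyError: 'group'` means that at least one entry in `ann` has no `'group'` key, so `val['group']` fails on it. A minimal sketch of a more defensive subset is shown below. It assumes the entries are plain dicts shaped like the ones in the code above; the toy `ann` and the `keep` helper are hypothetical and only for illustration, not part of the package. The idea is to read every field with `dict.get`, so an entry with a missing key is dropped instead of raising.

```python
# Hypothetical toy data standing in for dann.data; the real entries may differ.
ann = {
    'A': {'group': 'Disorder', 'type': 'Disease', 'phenotype': ['HP:1'],
          'prevalence': [{'type': 'Point prevalence', 'class': '1-9 / 100 000'}]},
    'B': {'type': 'Disease'},  # no 'group' key: this is what triggers the KeyError
    'C': {'group': 'Disorder', 'type': 'Clinical syndrome', 'phenotype': ['HP:2'],
          'prevalence': [{'type': 'Point prevalence', 'class': 'Unknown'}]},
    'D': {'group': 'Disorder', 'type': 'Disease', 'phenotype': ['HP:3']},  # no prevalence
}

valid_prev = {'>1 / 1000', '6-9 / 10 000', '1-5 / 10 000', '1-9 / 100 000',
              'Unknown', 'Not yet documented'}


def keep(val):
    """Hypothetical helper: True if an entry passes all four filters."""
    # Missing 'group' is treated as "not a disorder" instead of raising.
    if val.get('group') != 'Disorder':
        return False
    # Keep only entries with phenotypic information.
    if not val.get('phenotype'):
        return False
    # Remove clinical syndromes; a missing 'type' is treated as an empty string.
    if (val.get('type') or '').lower() == 'clinical syndrome':
        return False
    # Keep only entries with at least one valid point prevalence class.
    classes = [p.get('class') for p in (val.get('prevalence') or [])
               if p.get('type') == 'Point prevalence']
    return any(c in valid_prev for c in classes)


# Build a new dict rather than deleting from ann while filtering.
subset = {dis: val for dis, val in ann.items() if keep(val)}
print(f'# diseases kept: {len(subset)}')
```

Whether entries without a `'group'` key should be dropped or handled differently depends on the data, so treat the choice above as an assumption rather than the intended behavior.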