10/03/2014 16:55 Local Title: TRANSFER OUT NOTE
Standard Title: TRANSFER SUMMARIZATION NOTE
AUTHOR: D,WARD
XYZ MEDICAL INSTITUTE
ABC NAGAR, PQW CITY-101011
******************************************************************
TRANSFER OUT NOTE
******************* OCT 03, 2014
UHID:000-01-0202 PATIENT NAME: NAME , SINGH
AGE/SEX:42/FEMALE
DOA:Sep 30,2014
DEPARTMENT:GYNAE AND OBSTETRICS UNIT:II
TRANSFERRED FROM:D3
NAME , SINGH 000-01-0202 DOB: 01/01/1972
TRANSFERRED TO : MCU
DIAGNOSIS:pop- em lscs with male baby nicu B
TREATMENT:
inj.cefazolin 1 gm bd
inj.rantac 1 amp tds
inj.perinorm 1 amp tds
inj.pcm 1 gm tds
inj.texid 1 gm tds
PATIENT STATUS AT THE TIME OF SHIFTING:
g.c. fair on iv fluid ..
NAME , SINGH 000-01-0202 DOB: 01/01/1972
VITALS AT THE TIME OF SHIFTING:
TEMP:98.6F
HR:88/MIN RR:24/MIN
GCS: E V M
< THE ABOVE NOTE IS UNSIGNED >
- DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY -
09/21/2014 23:01 Local Title: MED ONCO IRCH DISCHARGE SUMMARY
Standard Title: DISCHARGE SUMMARY
AUTHOR: KUMAR,UVW
LOCAL TITLE: MED ONCO IRCH DISCHARGE SUMMARY
STANDARD TITLE: DISCHARGE SUMMARY
NAME , SINGH 000-01-0202 DOB: 01/01/1972
DATE OF NOTE: SEP 21, 2014@22:04 ENTRY DATE: SEP 21, 2014@22:04:42
AUTHOR: UVW KUMAR
REGISTRATION DETAILS
********************
UHID No:000-01-0202 IRCH No:000222 CR No:111000
NAME: NAME AGE:22 YEAR GENDER:MALE
DOA:Sep 2, 2014 DOD:Sep 18, 2014 DURATION OF STAY: days
WARD: MRO Ward BED No:14
CONSULTANT INCHARGE:Dr UVW Kumar
DIAGNOSIS & REASON FOR CURRENT ADMISSION
****************************************
DIAGNOSIS:Acute Promyelocytic leukemia (Intermediate Risk)
ADMITTED FOR :Chemotherapy
CASE SUMMARY:NAME Singh presented with complaints of bleeding gums, fever,
NAME , SINGH 000-01-0202 DOB: 01/01/1972
blurring of vision and gum hypertrophy. He diagnosed as APML in PQW
hospital based on PS, BMA and PML/RARa positive. He started on ATRA and after
that reffered here. His basline hemorem at PQW Hospital was s/o Hb :
4.6, TLC: 1580/cu.mm, Platlet: 6000/cu.mm. So he is classified as
intermideate risk APML. After coming here diagnosis reconfirmed,
daunorubicin given 60mg/m2 and continoued on ATRA. No features of
ATRA syndrome noticed during ward stay. His fibrinogen level were > 450
mg/dl. He remained afebrile and hemodynamically stable and dischared on
stable condition.
PRESENTATION AT CURRENT ADMISSION
*********************************
VITAL SIGNS:
TEMP:99 F RESP:19/min PULSE:98/min
BP:121/78 mm of Hg SPO2:99% on RA
NAME , SINGH 000-01-0202 DOB: 01/01/1972
GENERAL PHYSICAL EXAMINATION: PERFORMANCE STATUS: I
PALLOR:+ ICTERUS:- OEDEMA:- CYANOSIS:-
STERNAL TENDERNESS:- CLUBBING:- GUM HYPERTROPHY:+
LYMPHNODES: -
BIOMETRIC DETAILS: WEIGHT: 45 kg HEIGHT:166 cms BSA: 1.4 m2
INVESTIGATIONS AT CURRENT ADMISSSION
************************************
PS (3/9/2014) : N2, L8, E-, M1, B-, Meta-, Myelo-, Blast 89%. Blast and abnormal
promyelocytes present. F/S/O Acute promyelocytic leukemia.
BMA (3/9/2014): Cellular BM shows 90% blast and abnormal promyelocyte. F/S/O
APML.
Flow Cytometery (3/9/2014): 87% abnormal promyelocyte, Positive : CD45, CD15,
NAME , SINGH 000-01-0202 DOB: 01/01/1972
CD11b, CD13, CD33, CD64, CD9, CD18, cMPO.
Negative for CD2, CD14, CD117, CD19, HLADR, CCD79a, cCD3.
Day 12 PS (9/9/2014): N78, L20, E-, M2, B-, Meta-, Myelo_ Promyelo Nil, Blast
Nil.
Condition at discharge:
VITAL SIGNS:
TEMP:99 F RESP:18/min PULSE:78/min
BP:112/74 mm of Hg SPO2:99% on RA
Plan At discharge and follow up: As written in OPD card
NAME , SINGH 000-01-0202 DOB: 01/01/1972
< THE ABOVE NOTE IS UNSIGNED >
- DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY -
09/21/2014 22:04 Local Title: MED ONCO IRCH DISCHARGE SUMMARY
Standard Title: DISCHARGE SUMMARY
AUTHOR: UVW,AMIT
REGISTRATION DETAILS
********************
UHID No:000-01-0202 IRCH No:000222 CR No:111000
NAME: NAME , SINGH AGE:42 GENDER:FEMALE
DOA:Sep 2, 2014 DOD:Sep 18, 2014 DURATION OF STAY: days
WARD: MRO Ward BED No:14
CONSULTANT INCHARGE:Dr Lalit Kumar
ADDRESS: ,
NAME , SINGH 000-01-0202 DOB: 01/01/1972
DIAGNOSIS & REASON FOR CURRENT ADMISSION
****************************************
DIAGNOSIS:
Acute Promyelocytic leukemia (Intermediate Risk)
ADMITTED FOR :Chemotherapy
CASE SUMMARY:NAME Singh presented with complaints of bleeding gums,
fever, blurring of vision and gum hypertrophy. He diagnosed as APML in
UVW hospital based on PS and PML/RARa positive. He started on ATRA and
after that reffered to XYZ hospital
PRESENTATION AT CURRENT ADMISSION
*********************************
VITAL SIGNS:
TEMP:F RESP:/min PULSE:/min
BP:/mm of Hg SPO2:%
NAME , SINGH 000-01-0202 DOB: 01/01/1972
GENERAL PHYSICAL EXAMINATION: PERFORMANCE STATUS:
PALLOR: ICTERUS: OEDEMA: CYANOSIS:
STERNAL TENDERNESS: CLUBBING: GUM HYPERTROPHY:
LYMPHNODES:
SPECIFIC FINDINGS:
BIOMETRIC DETAILS: WEIGHT:kgS HEIGHT:cms BSA: m2
INVESTIGATIONS AT CURRENT ADMISSSION
************************************
< THE ABOVE NOTE IS UNSIGNED >
- DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY * DRAFT COPY -
NAME , SINGH 000-01-0202 DOB: 01/01/1972
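This is the text content I need to convert to CSV. It holds the details of a patient who has visited the hospital several times. I want to extract the medical data into separate columns [Age, Sex, UHID, DOA, Department, Diagnosis, Treatment, Patient status, Vitals, Local title, Standard title, Case summary, Admission, General physical examination].
As you can see, "DIAGNOSIS" appears more than once, and the column names may also vary between notes.
The file to be processed is 15 GB.
Please suggest an approach to this problem. I have tried Python, OpenRefine, and cTAKES.
Please explain how to tackle this kind of problem. The constraint is that we may only use free, open-source tools.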
Answer 0 (score: 1)
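You can do a fair amount of this with gawk. Multi-line fields such as vitals and treatment may be harder to get into CSV format, but here is a start for the single-valued fields.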
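# Requires gawk: the three-argument form of match() is a gawk extension.

# Quote a field for CSV; values such as "Sep 30,2014" contain commas.
function q(s) {
    gsub(/"/, "\"\"", s)
    return "\"" s "\""
}

# Print the current record, then clear the fields so that a value
# missing from the next note is not inherited from this one.
function dump() {
    print q(age) "," q(sex) "," q(uhid) "," q(doa) "," q(dept) "," q(diagnosis)
    age = sex = uhid = doa = dept = diagnosis = ""
}

BEGIN { onfirst = 1 }
END   { if (!onfirst) dump() }

# Normalise every line before the patterns below are tested.
{
    sub(/^ */, "")
    sub(/UHID No/, "UHID")
}

# A UHID line starts a new note: flush the previous one first.
match($0, /UHID:([^ ]*)/, a) {
    if (onfirst)
        onfirst = 0
    else
        dump()
    uhid = a[1]
}

match($0, /AGE\/SEX:([0-9]*)\/(.*[^ ]) *$/, a) {
    age = a[1]
    sex = a[2]
}

# Three whitespace-separated tokens, e.g. "Sep 2, 2014" (also matches "Sep 30,2014").
match($0, /DOA:([^ ][^ ]* *[^ ][^ ]* *[^ ][^ ]*)/, a) {
    doa = a[1]
}

match($0, /DEPARTMENT:(.*[^ ]) *UNIT/, a) {
    dept = a[1]
}

# Only matches when the value is on the same line as "DIAGNOSIS:".
match($0, /DIAGNOSIS:(.*[^ ]) *$/, a) {
    diagnosis = a[1]
}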
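
For the multi-line fields mentioned above (treatment, patient status, vitals), one possible approach is a small line-by-line state machine. The following is a minimal Python sketch, not a complete solution. It assumes that each note begins with a "Local Title:" line, that a section runs until the next all-caps "HEADING:" line or the "< ... UNSIGNED >" footer, and that repeated "NAME ... DOB:" lines are page headers to be skipped. The section headings, column names, and function names are illustrative choices based only on the sample transfer note; the discharge summaries use different headings ("VITAL SIGNS:" appears twice, for example) and would need extra entries and rules.

```python
import csv
import re

# Assumed section headings (from the sample transfer note) -> CSV column.
SECTIONS = {
    "TREATMENT:": "treatment",
    "PATIENT STATUS AT THE TIME OF SHIFTING:": "status",
    "VITALS AT THE TIME OF SHIFTING:": "vitals",
}
COLUMNS = ["local_title", "uhid", "treatment", "status", "vitals"]

NOTE_START = re.compile(r"^\d\d/\d\d/\d{4} \d\d:\d\d Local Title: (.*)$")
UHID = re.compile(r"UHID(?: No)?:(\S+)")
# Repeated page header, e.g. "NAME , SINGH 000-01-0202 DOB: 01/01/1972".
PAGE_HEADER = re.compile(r"\d{3}-\d\d-\d{4} DOB: \d\d/\d\d/\d{4}\s*$")
# Any other all-caps "HEADING:" line, a row of asterisks, or a footer line
# closes the section that is currently open.
SECTION_END = re.compile(r"^(?:[A-Z][A-Z &/]*:\s*$|<|\*|- DRAFT)")


def parse_notes(lines):
    """Yield one dict per note, reading the input one line at a time."""
    note, section = None, None
    for raw in lines:
        line = raw.strip()
        if not line or PAGE_HEADER.search(line):
            continue
        start = NOTE_START.match(line)
        if start:
            if note is not None:
                yield note
            note = dict.fromkeys(COLUMNS, "")
            note["local_title"] = start.group(1)
            section = None
            continue
        if note is None:
            continue  # text before the first note
        if line in SECTIONS:
            section = SECTIONS[line]
        elif SECTION_END.match(line):
            section = None
        elif section:
            # Join continuation lines with "; " so each note stays on one row.
            note[section] = "; ".join(filter(None, [note[section], line]))
        else:
            m = UHID.search(line)
            if m and not note["uhid"]:
                note["uhid"] = m.group(1)
    if note is not None:
        yield note


def write_csv(lines, out):
    """Write all parsed notes to the open text file `out` as CSV."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(parse_notes(lines))
```

Passing an open file object as `lines` (for example `write_csv(open(path, errors="replace"), sys.stdout)`, with `path` being your input file) iterates over it one line at a time, so the 15 GB input never has to fit in memory. The `csv` module also takes care of quoting values that contain commas or quotes.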