Text Generation in MedView Applications
Guide written by Fredrik Lindahl, last updated November 22, 2003
Introduction
This guide describes the rules and definitions that the MedView applications
use for text generation. In the following, '\n' denotes the newline character.
Definitions
- Line enders are characters that the generator engine recognizes as
ending a line. Some examples of line enders in the current engine are:
. ? ! \n
Note that the newline character is defined as a line ender.
- Colon strings currently consist of the following characters: ': -'.
- Separators currently consist of only the comma character ','.
- A line starts at the first character that is not whitespace or a
line-ender character, and ends at the first occurrence of a line-ender character.
For instance, in the text "\n\nA camel can walk in the desert. The sun shines
on the cactus.", the line "A camel can walk in the desert." starts with the 'A'
and ends at the dot after "desert".
- A title line is a line that begins on a new line and ends with a
newline (and not with another line ender, such as '.' or '!'). The first line
in the document is considered to begin on a new line. Some examples of title
lines: "\nAnamnes\n", "\n\n\nSensitivity\n\n", and "Some title\n", where the
last example starts at document offset 0 (i.e. the start of the document).
Some examples of non-title lines: "\nAnamnes.\n" and "a previous line. The line
to consider.". Title lines have special properties: each title line is connected
with the content following it until the next title line occurs. If all connected
content is removed, the title line is removed as well.
- A significant line is a line that contains significant, non-derived
content based on information in a medical record. When a section contains
at least one significant line, it is included in the generated output.
For instance, if the template contains the line "The patient bleeds: bleed.",
where bleed is a term, and the selected medical record contains content for
the bleed term that does not translate to noline, the line is significant.
- An invalid line is a line that must be removed from the output at generation
time. A line becomes invalid when it contains at least one term that does not
translate to anything, or that translates to noline. For instance, if the
template contains the line "The patient bleeds: bleed.", where bleed is a
term, and the selected medical record contains a value resulting in a noline
translation, the line becomes invalid and is removed from the generated output.
- A potentially significant term is a term that is not derived from
another term. For instance, the terms PCode(age), PCode(female or male), and
PCode(pid) are examples of derived terms, i.e. their values depend on the
value of another term found in the medical record, namely the pcode term.
The following terms are examples of non-derived terms: Occup, Born, Smoke.
A potentially significant term can create a significant line: if the term
contains content that translates to a significant value (i.e. not noline),
the line that contains it becomes significant.
- A significant term is a potentially significant term that, at generation
time, is shown to have significant content (i.e. not a value that results
in a noline translation). A line containing a significant term and no
invalid terms is deemed significant at generation time.
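The line and title-line definitions above can be illustrated with a small scanner. This is only a sketch in Python, not the MedView engine: the names `LINE_ENDERS`, `LineSpan` and `split_lines` are hypothetical, only the four line enders listed in this guide are used, and a line that runs to the end of the text without a newline is assumed not to be a title.

```python
from typing import List, NamedTuple

LINE_ENDERS = ".?!\n"  # the enders named in this guide; the real engine may know more


class LineSpan(NamedTuple):
    start: int      # offset of the first character of the line
    text: str       # line content without its ender
    ender: str      # the line ender that closed the line ("" at end of text)
    is_title: bool  # begins on a new line and is closed by a newline


def _begins_on_new_line(text: str, start: int) -> bool:
    """True if only spaces/tabs separate `start` from a newline or the document start."""
    j = start - 1
    while j >= 0 and text[j] in " \t":
        j -= 1
    return j < 0 or text[j] == "\n"


def split_lines(text: str) -> List[LineSpan]:
    """Split text into lines according to the definitions in this guide."""
    spans = []
    i, n = 0, len(text)
    while i < n:
        # A line starts at the first character that is neither whitespace nor a line ender.
        while i < n and (text[i].isspace() or text[i] in LINE_ENDERS):
            i += 1
        if i >= n:
            break
        start = i
        # A line ends at the first line ender that follows.
        while i < n and text[i] not in LINE_ENDERS:
            i += 1
        ender = text[i] if i < n else ""
        is_title = ender == "\n" and _begins_on_new_line(text, start)
        spans.append(LineSpan(start, text[start:i], ender, is_title))
    return spans
```

For example, `split_lines("\nAnamnes\n")` yields one span flagged as a title, while `split_lines("\nAnamnes.\n")` yields one span that is not, because the dot closes the line before the newline does.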
Rules and the Generation Algorithm
The template is made up of a number of sections, each containing a number
of lines, each (optionally) containing a number of terms. When
producing the generated journal, each section that the user has chosen to include
is processed one after the other. During the processing of a section, various
rules decide whether or not the section has anything of interest, and thus whether
or not it should be included at all in the output journal.
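The nesting described above (template, sections, lines, terms) could be modelled as follows. This is an illustrative Python sketch only; the class names, the `derived` flag and the helper `sections_to_process` are assumptions made for the example, and processing the chosen sections in template order is likewise assumed rather than stated in this guide.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Term:
    name: str
    derived: bool = False  # True for terms such as PCode(age) that depend on another term


@dataclass
class Line:
    text: str
    terms: List[Term] = field(default_factory=list)  # a line may contain no terms at all


@dataclass
class Section:
    name: str
    lines: List[Line] = field(default_factory=list)


@dataclass
class Template:
    sections: List[Section] = field(default_factory=list)


def sections_to_process(template: Template, chosen: Set[str]) -> List[Section]:
    """Return the sections the user chose, in template order (assumed ordering)."""
    return [s for s in template.sections if s.name in chosen]
```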
Rule 1 - If a section contains only plain text, it is included. This
is so, because the creator of the template could only have produced such a section
for one purpose - namely to be included if chosen. There exists no other reason
to create such a section.
Rule 2 - If a section contains only plain text and derived terms, the
section is included.
Rule 3 - If a section contains at least one potentially significant
term, and at least one significant line is produced during generation, the section
is included.
Rule 4 - If a section contains at least one potentially significant
term, and no significant line was produced during generation, the section is
removed.
Rule 5 - If the content between two successive titles contains no valid
text, the first title is removed and the second title begins where the first
began. For instance, take the following text: "DISEASES\n\nThe patient
suffers from disease.\n\nMEDICATION\n\nThe patient takes drug. Side effects
that the patient recognizes are side-effects.", with the title lines
"DISEASES" and "MEDICATION". If the term disease produces a noline
translation, the line containing it becomes invalid. Thus, according to the
rule, both the line and the title it belongs to are removed, and the text
becomes "MEDICATION\n\nThe patient takes drug. Side effects that the patient
recognizes are side-effects.".
Rule 6 - Removing a line that is preceded by at least two newline
characters and followed by at least one newline character causes the
following line to begin where the removed line started.
Rule 7 - Removing a line that is preceded by at least one newline
character and followed by another line (not a newline character) causes
the following line to begin where the removed line started.
Rule 8 - Removing a line that is preceded by another line and followed
by at least one newline character makes the paragraph end with the preceding
line.
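Rules 1 to 4 reduce to a decision over two facts about a section. The Python sketch below shows that reduction; the function name `include_section` and its parameter names are hypothetical and not taken from the MedView code.

```python
def include_section(has_potentially_significant_term: bool,
                    produced_significant_line: bool) -> bool:
    """Decide whether a chosen section appears in the generated output."""
    if not has_potentially_significant_term:
        # Rules 1 and 2: only plain text, or plain text plus derived terms.
        return True
    # Rules 3 and 4: the section survives only if a significant line was produced.
    return produced_significant_line
```

A section with the single line "Rökning: Smoke." is therefore kept only when Smoke translates to something other than noline, whereas a section of plain text is always kept.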
The generation algorithm:
For each section S:
  - Define a flag foundAtLeastOneNonDerivedChild for this section, initially
    false.
  - Define a flag foundSignificantLine for this section, initially false.
  - Define a set T for this section, to contain all found title lines,
    initially empty.
  - For each line L in S:
      - If L is a title line, add L to the set T.
      - Check the number of terms contained in the line:
          - If the line contains no terms, it is a valid but insignificant
            line.
          - If the line contains terms, but all the terms are derived, the
            line is also a valid but insignificant line.
          - If the line contains terms, and at least one of the terms is a
            potentially significant term, the foundAtLeastOneNonDerivedChild
            flag is set to true.
      - If the line contains at least one significant term and no invalid
        terms, it is deemed a significant line, and the foundSignificantLine
        flag is set to true. Also, the latest addition to the title set T is
        tagged for inclusion.
      - If the line is valid and not a title line, tag the latest addition to
        the title set T for inclusion.
      - If the line is invalid, remove it.
  - Remove all elements of the set T that are not tagged for inclusion.
  - If the foundAtLeastOneNonDerivedChild flag is still false, the section is
    included. If the foundAtLeastOneNonDerivedChild flag is true, but the
    foundSignificantLine flag is still false, the section is removed.
  - Remove all occurrences of two or more whitespace characters in succession.
  - Remove all occurrences of two or more line enders (excluding newline) in
    succession.
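The steps above can be sketched in Python as follows. This is a simplified reading, not the MedView implementation: the classes and the names `process_section` and `tidy` are hypothetical, a term value of `None` stands in for both "no translation" and a noline translation, and lines are assumed to be classified as title or non-title beforehand. The two clean-up steps are interpreted here as collapsing runs of spaces and tabs to one space (newlines are left alone) and keeping only the first of several consecutive line enders; the guide does not spell this out.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    name: str
    derived: bool
    value: Optional[str]  # None models a missing or noline translation (assumption)


@dataclass
class Line:
    text: str
    is_title: bool = False
    terms: List[Term] = field(default_factory=list)


def process_section(lines: List[Line]) -> Optional[List[Line]]:
    """Return the surviving lines of a section, or None if the section is removed."""
    found_non_derived = False
    found_significant = False
    title_count = 0   # size of the title set T so far
    tagged = set()    # indices of titles tagged for inclusion
    kept = []         # (line, title index or None) pairs that were not removed
    for line in lines:
        title_index = None
        if line.is_title:
            title_index = title_count
            title_count += 1
        has_non_derived = any(not t.derived for t in line.terms)
        if has_non_derived:
            found_non_derived = True
        if any(t.value is None for t in line.terms):
            continue  # invalid line: removed from the output
        significant = has_non_derived  # valid line with a non-derived, translated term
        if significant:
            found_significant = True
        if (significant or not line.is_title) and title_count > 0:
            tagged.add(title_count - 1)  # tag the latest addition to T
        kept.append((line, title_index))
    if found_non_derived and not found_significant:
        return None
    # Drop titles that were never tagged for inclusion.
    return [l for l, idx in kept if idx is None or idx in tagged]


def tidy(text: str) -> str:
    """Clean-up pass under the interpretation stated in the lead-in."""
    text = re.sub(r"[ \t]{2,}", " ", text)
    return re.sub(r"([.?!])[.?!]+", r"\1", text)
```

Applied to the DISEASES/MEDICATION example from Rule 5, with disease translating to nothing and drug translating to a value, `process_section` returns only the "MEDICATION" title and its line.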
Some comments:
- A section containing only plain text (no terms at all), or containing plain
text and only derived terms, will be included.
An Example
In the following section-divided text, lines with a greenish tone are title
lines, and lines with a reddish tone will become significant because the
following terms produce a non-noline translation: Adv-drug, Occup, Ref-in,
Ref-cause, Drug, Symp-now, Time-dry, Dent-treat, HAD-A, Diag-nr, Note17.
Potentially significant terms are purplish, and derived terms are blueish.
The sections with a yellowish background will be removed. The result after
generation (with no values for any terms other than those listed above) is
shown below.
ID: PCode(pid) |
ANAMNES - Date(date)
Överkänslighet:
Adv-drug.
Blödningsbenägenhet: Bleed.
Allmänt
Pcode(age)-årig
Pcode(female or male) Occup
som Ref-in för Ref-cause.
Note01.
Allmän anamnes
Health. Checkup.
Symp-head. Besvär från genitala
slemhinnan: Genitals. Note02.
Aktuella sjukdomar: Dis-now.
Tidigare sjukdomar: Dis-past.
Hudsjukdomar: Skin-pbl.
Aktuell medicinering: Drug.
Allergier: Allergy.
Alkohol: Alcohol.
Snus: Snuff.
Rökning: Smoke.
Symptomrelaterade uppgifter
Patienten anger för närvarande Symp-now.
Besvären är lokaliserade till Symp-site.
Graderas till Vas-now enligt VAS. Patienten
rapporterar att besvären debuterade för cirka Symp-on
sedan. Patienten upplever att Symp-trigg
initierade symptomen. Besvären upplevs som Symp-var.
Durationen anges till Symp-dur med en frekvens
på Symp-freq. Besvären uppträder
Symp-24h. Uppger sig tidigare ha haft Symp-past
i regionen. Anger att dessa besvär enligt VAS var Vas-past.
Note03.
På frågan om vad som ökar symptomen
anger patienten Factor-neg. Däremot
så minskar symptomen vid Factor-pos.
Tidigare har Treat-pos givit positivt resultat.
I samband med att patienten erhållit Treat-neg
har en negativ påverkan upplevts. Även patientens Family
har haft liknande besvär. Note04.
Vävnadsförändring
Lesionen observerades första gången för Lesn-on
sedan. Lokaliserar själv lesionen till Lesn-site.
Lesn-var. Note05.
Muntorrhet
Vätska vid födointag: Water-meal.
Svårigheter att tala: Speech-pbl.
Duration av muntorrhet: Time-dry.
Andra symptom från munhålan: SS-exsym.
Spottkörtelsvullnad: SS-swollen.
Spottkörtelundersökning: Gland-exam.
Reumatisk sjukdom: SS-reum.
Heriditet reumatisk sjukdom: SS-reumfam.
Ögonsymptom: Eye-sand.
Duration av ögontorrhet: Eye-dry.
Ögondroppar: Eye-drops.
Ögonundersökning: Eye-exam.
Note06.
Tandläkarbehandling
Har genomgått Dent-treat
hos tandläkare.
Bettfysiologi
Bettfysiologiskt uppvisar patienten Symp-joint.
Symp-musc. Note07.
Tandvårdsrädsla
Patienten uppger Anx-amount rädsla vid
tandläkarbesök. Regelbunden tandvård Care-cont.
Fullständig tandvård för Care-past.
Senaste tandläkarbesöket upplevdes som Care-eval.
Note23. Patienten tror att rädslan i
första hand hänger samman med: Care-reason.
Anser sig varit Afraid-past inför tidigare
tandvård och tandläkare. Som barn upplevdes tandvården
som Care-exp. Patienten anger att behandlingar
ofta har varit Pain-past. Tror att rädsla
hos andra i omgivningen kan ha bidragit till den upplevda Fear-relat.
Särskilt anges rädsla hos Fear-rel.
Patienten upplever dessutom Fear-oktr för
att förlora kontrollen i samband med tandläkarbesök. Söker
just nu behandling pga Reason-treat. Det
känns Treat-fear för patienten
att få hjälp med sin rädsla. Patienten uppger att det
känns Treat-teeth att få behandling
av tänderna. Tror att möjligheten att bota rädslan är
Fear-treat. Motivation/engagemang för fobibehandling bedömer
patienten som Treat-life. Patienten tror
att behandlingsformer som Treat-suit kan
vara lämpliga. Anger särskilt att tandläkaren skall: Dent-patient.
Dent-exp. Dent-0pain.
Dent-accom. Dessutom vill patienten gärna
att Dent-adjust. Patienten upplever negativa
konsekvenser av tandvårdsrädslan i form av: Fear-0treat.
Tandvårdsrädslan ger upphov till komplikation när det
gäller kontakter med: Fear-fam. Fear-friend.
Fear-work. Andra negativa konsekvenser är:
Fear-varia. Känslor förknippade
med tandvårdsrädsla är: Fear-anger.
Fear-shame. Fear-avoid.
Fear-depr.
Psykometri
HAD-A: HAD-A.
HAD-A_m: HAD-A_m.
HAD-D: HAD-D. HAD-D_m: HAD-D_m.
DAS: DAS. DAS_m: DAS_m.
DFS: DFS. DFS_m: DFS_m.
GFSILL: GFSILL. GFSILL_m: GFSILL_m.
GFSEMB: GFSEMB. GFSEMB_m: GFSEMB_m.
GFSSOC: GFSSOC. GFSSOC_m: GFSSOC_m.
GFSFYS: GFSFYS. GFSFYS_m: GFSFYS_m.
GFSANI: GFSANI. GFSANI_m: GFSANI_m.
GFSMEAN: GFSMEAN. GFSMEAN_m: GFSMEAN_m.
GFSFOB: GFSFOB.
Note08.
|
STATUS - Date(date)
Direkt status
Patienten uppvisar en Mucos-colr slemhinna. Förändringen är
lokaliserad till Mucos-site. Reaktionsmönstret karaktäriseras
av Mucos-txtur. Storleken är cirka Mucos-size cm2. Note19. Palpation
av Palp-site. Palp-cons vid palpation. Palperas öm vid följande
muskler: Palp-musc. Palp-rel mot underlaget. Storleken bedöms vid
palpation till cirka Palp-size cm3. Note09. Patienten upplever en Sens-site
i regionen. Note11.
Bettfunktion
Occl-type. Patienten uppvisar Joint-dys. Interfer. Facetts. Note10. Note23.
Indirekt status
Röntgen: Xray-type tas mot Xray-site. Röntgen visar Xray-txtur.
Röntgen på aktuella tänder visar Xray-teeth. Note12.
Mikrobiologi
Mikrobiologiskt prov tas i Micro-site. I provet påvisas växt
av Micro-type. Note14.
Hematologi
Autoantikropper: Auto-Ab. B-HB = Blood-HGB g/L. B-WBC = Blood-WBC 10^9/L.
B-RBC = Blood-RBC 10^12/L. B-PTL = Blood-PTL 10^9/L. S-B12 = Blood-Kob.
S-folat = Blood-Fol. fS-Järn = Blood-Fe. fB-Glukos = Blood-Glu. P-APTT
= Blood-APTT. P-PTK = Blood-PK. B-TSH = Blood-TSH. S-FrittT4 = Blood-T4.
Note15.
Salivvärden
Salivsekretion = Saliva-flow. Saliv-pH = Saliva-PH. Buffringskapacitet
= Saliva-buff. I sialografi syns Saliva-scin. Note16.
|
Histopatologisk undersökning
- Date(date)
Tecken på malignitet: Hist-malign
Bör kontrolleras regelbundet: Hist-control
Dysplasigrad: Hist-dyspl
Biopsi tagen i Biopsy-site.
Biopsinummer vid fler än en biopsi: Biopsy-numb.
Preparatet utgörs av Tiss-type.
Typ av undersökning: Histus-type.
Immunfluoroscens undersökningen visar: Tiss-fluor reaktionsmönster.
Focusscore Focus foci/cm2.
Den histopatologiska undersökningen visar: Note27.
|
TENTATIV DIAGNOS - Date(date)
Diag-tent.
|
HISTOPATOLOGISK DIAGNOS
- Date(date)
Diag-hist.
|
DIAGNOS - Date(date)
Diag-def.
Diagnos-nummer: Diag-nr.
Note17.
|
DAGANTECKNING - Date(date)
Besöket avser Vis-cause. Planerade åtgärder:
Plan-next. Exam-type. Kliniskt noteras att status objektivt är Treat-eval-obj.
Patienten upplever sig subjektivt Treat-eval-subj. Treat-type. Patienten
informeras om risken för Preop-inf. Anest-type. Anest-vol. Incis-type
i Biopsy-site. Tiss-type utdissikeras och överförs till transporteras
i Tiss-transp. Note31. Wound-treat. Sut-type. Sut-numb. Postop-inf. Note26.
Recept Treat-drug. Next-app. Note18.
|
---> generation produces --->
ID: 500530-4394
ANAMNES - 2003-04-15
Överkänslighet: Blommor.
Allmänt
53-årig manlig byggarbetare som självmant skrev in sig för
tandvärk.
Aktuell medicinering: Valium.
Symptomrelaterade uppgifter
Patienten anger för närvarande besvär i knäleden.
Muntorrhet
Duration av muntorrhet: 5.
Tandläkarbehandling
Har genomgått regelbunden behandling hos tandläkare.
Psykometri
HAD-A: 4.
DIAGNOS - Date(date)
Diagnos-nummer: 02318324. Patienten är rädd
för ormar.
|