# definitions - wired into processor
# $S, $NameStart, $NameC, $Hex - per XML spec
# State Statename Fallbackstate {errormessage}

# Start
State Start InProlog {in prolog}
T < LTatStart !MarkLt
T $S InProlog

State LTatStart BustedProlog {}
T ? Push(TryXMLD,InProlog) !ColdStart
T ! MDOinProlog
T $NameStart StagGI !HotStart

State BustedProlog InProlog {}
T > InProlog
T $. BustedProlog

State InProlog BustedProlog {}
T < LTinProlog !MarkLt
T $S InProlog

State AfterDTD AfterDTD {after DTD}
T < LTafterDTD !MarkLt
T $S AfterDTD

# saw < in prolog
State LTinProlog LTinProlog {after <}
T ? Push(InPI,InProlog) !ColdStart
T ! MDOinProlog
T $NameStart StagGI !HotStart

# saw < after DTD in prolog
State LTafterDTD AfterDTD {}
T ? Push(InPI,AfterDTD) !ColdStart
T ! MDOafterDTD
T $NameStart StagGI !HotStart

# saw < in doc
State SawLT BustedMarkup {}
T ? Push(InPI,InDoc) !ColdStart
T ! MDO
T / ETAGO
T $NameStart StagGI !HotStart

# saw < in internal subset
State LTInSubset BustedDecl {}
T ! MDOInSubset
T ? Push(InPI,InSubset) !ColdStart

# saw in PIC
State PICMustHaveGT BustedMarkup {}
T > Pop() !ReportPI

# want > in PIC
State PICwantsGT BustedMarkup {}
T > Pop() !ReportPI
T ? PICwantsGT
T $. PI4

# saw Pop() !ColdStart

# saw -- in comment
State DDInPComment InComment {}
T > Pop()

State InSubsetCom BustedDecl {}
T - DashInSubsetCom
T $. InSubsetCom

State DashInSubsetCom InSubsetCom {}
T - DoubleDashInSubsetCom
T $. InSubsetCom

State DoubleDashInSubsetCom InSubsetCom {}
T > InSubset

# saw InDoc !ReportText !ColdStart
T $. InCData !SaveExtra(]])

# after AfterDTD
T $S SawDTName

# AfterDTD !ReportDoctype
T [ InSubset !ReportDoctype
T $S SawDTypeExternalID

# saw [ for internal subset
State InSubset InSubset {in internal subset}
T < LTInSubset !MarkLt
T % Push(PERef,InSubset) !MarkPcAtTopLevel !ColdStart
T ] RSBInSubset
T $S InSubset

State RSBInSubset InSubset {}
T ] RSB2InSubset
T > AfterDTD
T $S AfterSubset

State RSB2InSubset InSubset {}
T > InSubset !LeaveMarkedSect

# in PE Ref
State PERef InSubset {in PE reference}
T $NameStart PERef2

State PERef2 InSubset {}
T $NameC PERef2
T ; Pop() !ReportReference

# After subset
State AfterSubset InDoc {after internal subset}
T > AfterDTD
T $S AfterSubset

# InSubset !DeclareNotation
T $S NDecGotPubID

State NDecGotPubID BustedDecl {}
T > InSubset !DeclareNotation
T ' Push(SQCData,NDecSawSysID) !ColdStart
T " Push(DQCData,NDecSawSysID) !ColdStart

# InSubset !DeclareNotation
State NDecAllDone BustedDecl {}
T > InSubset !DeclareNotation

# Trying for InSubset !EndGI !ReportAttDecl !ReportAttlist

# Base Attlist state
State InAttlist BustedDecl {in InSubset !ReportAttlist
T % Push(PERef,InAttlist) !MarkPC !ColdStart
T $NameStart InAttName !HotStart
T $S InAttlist

# in attr name
State InAttName BustedDecl {in attribute name}
T $NameC InAttName
T $S AfterAttName !EndAttribute

# after attr name
# We don't really care about the type, we'll just scoop it up as a string
State AfterAttName BustedDecl {after attribute name}
T ( StartTokenList !HotStart
T C Keyword(CDATA,AttrTypeCDATA,1) !HotStart
T E AttrTypeE !HotStart
T I AttrTypeI !HotStart
T N AttrTypeN !HotStart
T % Push(PERef,AfterAttName) !MarkPC !ColdStart
T $S AfterAttName

# Token list
State StartTokenList BustedDecl {in tokenized attribute value}
T % Push(PERef,StartTokenList) !MarkPC !ColdStart
T $S StartTokenList
T $NameC FirstToken

State FirstToken BustedDecl {}
T $NameC FirstToken
T $S AfterToken
T | BeforeToken
T ) AfterTokList !EndSave

State AfterToken BustedDecl {}
T % Push(PERef,AfterToken) !MarkPC !ColdStart
T $S AfterToken
T | BeforeToken
T ) AfterTokList !EndSave

State BeforeToken BustedDecl {}
T % Push(PERef,BeforeToken) !MarkPC !ColdStart
T $S BeforeToken
T $NameC InToken

State InToken BustedDecl {}
T $NameC InToken
T $S AfterToken
T | BeforeToken
T ) AfterTokList !EndSave

State AfterTokList BustedDecl {}
T $S AfterAttrType

State AttrTypeCDATA BustedDecl {after CDATA}
T $S AfterAttrType !EndSave

State AttrTypeE BustedDecl {in ENTIT(Y,IES)}
T N AttrTypeEn

State AttrTypeEn BustedDecl {}
T T AttrTypeEnt

State AttrTypeEnt BustedDecl {}
T I AttrTypeEnti

State AttrTypeEnti BustedDecl {}
T T AttrTypeEntit

State AttrTypeEntit BustedDecl {}
T I AttrTypeEntiti
T Y AttrTypeEntity

State AttrTypeEntity BustedDecl {}
T $S AfterAttrType !EndSave

State AttrTypeEntiti BustedDecl {}
T E AttrTypeEntitie

State AttrTypeEntitie BustedDecl {}
T S AttrTypeEntities

State AttrTypeEntities BustedDecl {}
T $S AfterAttrType !EndSave

State AttrTypeI BustedDecl {in ID(REF(S))}
T D AttrTypeId

State AttrTypeId BustedDecl {}
T $S AfterAttrType !EndSave
T R AttrTypeIdr

State AttrTypeIdr BustedDecl {}
T E AttrTypeIdre

State AttrTypeIdre BustedDecl {}
T F AttrTypeIdref

State AttrTypeIdref BustedDecl {}
T $S AfterAttrType !EndSave
T S AttrTypeIdrefs

State AttrTypeIdrefs BustedDecl {}
T $S AfterAttrType !EndSave

State AttrTypeN BustedDecl {in type keyword}
T O Keyword(NOTATION,AttrTypeNotation,2)
T M Keyword(NMTOKEN,AttrTypeNmtoken,2)

State AttrTypeNotation BustedDecl {after NOTATION}
T $S BeforeNotList

# notation list
State BeforeNotList BustedDecl {}
T % Push(PERef,BeforeNotList) !MarkPC !ColdStart
T $S BeforeNotList
T ( StartNotList

State StartNotList BustedDecl {in NOTATION list}
T % Push(PERef,StartNotList) !MarkPC !ColdStart
T $S StartNotList
T $NameStart FirstNotName

State FirstNotName BustedDecl {}
T $NameC FirstNotName
T $S AfterNotName
T | BeforeNotName
T ) AfterAttrType !EndSave

State AfterNotName BustedDecl {}
T % Push(PERef,AfterNotName) !MarkPC !ColdStart
T $S AfterNotName
T | BeforeNotName
T ) AfterAttrType !EndSave

State BeforeNotName BustedDecl {}
T % Push(PERef,BeforeNotName) !MarkPC !ColdStart
T $S BeforeNotName
T $NameStart InNotName

State InNotName BustedDecl {}
T $NameC InNotName
T $S AfterNotName
T | BeforeNotName
T ) AfterAttrType !EndSave

State AttrTypeNmtoken BustedDecl {after NMTOKEN}
T $S AfterAttrType !EndSave
T S AttrTypeNmtokens

State AttrTypeNmtokens BustedDecl {}
T $S AfterAttrType !EndSave

# after attrtype
# NB, may need ColdStart on '#' transition
State AfterAttrType BustedDecl {after attribute type}
T % Push(PERef,AfterAttrType) !MarkPC !ColdStart
T $S AfterAttrType
T # AttrDefHash !ColdStart
T " Push(DQADef,InAttlist) !ColdStart !EnterAttrVal
T ' Push(SQADef,InAttlist) !ColdStart !EnterAttrVal

State AttrDefHash BustedDecl {in attribute default}
T R Keyword(REQUIRED,AttrDefRequired,1)
T I Keyword(IMPLIED,AttrDefImplied,1)
T F Keyword(FIXED,AttrDefFixed,1)

# #required & #implied are static to us
State AttrDefRequired BustedDecl {after #REQUIRED}
T $S InAttlist !ReportDefKW !ReportAttDecl
T > InSubset !ReportDefKW !ReportAttDecl !ReportAttlist

State AttrDefImplied BustedDecl {after #IMPLIED}
T $S InAttlist !ReportDefKW !ReportAttDecl
T > InSubset !ReportDefKW !ReportAttDecl !ReportAttlist

State AttrDefFixed BustedDecl {after #FIXED}
T $S AfterFixed !ReportDefKW

State AfterFixed BustedDecl {}
T % Push(PERef,AfterFixed) !MarkPC !ColdStart
T $S AfterFixed
T ' Push(SQADef,InAttlist) !ColdStart !EnterAttrVal
T " Push(DQADef,InAttlist) !ColdStart !EnterAttrVal

# saw InSubset !DoEMPTY

State AfterANY BustedDecl {after ANY}
T $S AfterANY
T > InSubset !DoANY

# InSubset

# Really done element content, no repeat count
State ReallyDoneEC BustedDecl {}
T % Push(PERef,ReallyDoneEC) !MarkPC !ColdStart
T $S ReallyDoneEC
T > InSubset

# Saw ( of a (non-top-level) CP
State CPStart BustedMarkup {start of content particle}
T % Push(PERef,CPStart) !MarkPC !ColdStart
T ( Push(CPStart,AfterFirst) !StartCP
T $S CPStart
T $NameC FirstType !HotStart

# gather lead-off child type
State FirstType BustedDecl {}
T $NameC FirstType
T $S DoneFirst !EndGI
T * DoneFirst !EndGI !MarkRep
T ? DoneFirst !EndGI !MarkRep
T + DoneFirst !EndGI !MarkRep
T ) PopCP() !EndGI
T , SeqNeedNext !EndGI !MarkConnector
T | ChoiceNeedNext !EndGI !MarkConnector

# have seen first child () of content particle
State AfterFirst BustedMarkup {after first member of content particle}
T $S DoneFirst
T , SeqNeedNext !MarkConnector
T | ChoiceNeedNext !MarkConnector
T * DoneFirst !MarkRep
T + DoneFirst !MarkRep
T ? DoneFirst !MarkRep
T ) PopCP()

# finished with first member of cp
State DoneFirst BustedMarkup {}
T % Push(PERef,DoneFirst) !MarkPC !ColdStart
T $S DoneFirst
T , SeqNeedNext !MarkConnector
T | ChoiceNeedNext !MarkConnector
T ) PopCP()

# after connector in a seq
State SeqNeedNext BustedMarkup {after ,}
T % Push(PERef,SeqNeedNext) !MarkPC !ColdStart
T $S SeqNeedNext
T $NameStart SeqType !HotStart
T ( Push(CPStart,AfterSeqMember) !StartCP

# child type in a sequence
State SeqType BustedMarkup {}
T $NameC SeqType
T $S DoneSeqMember !EndGI
T * DoneSeqMember !EndGI !MarkRep
T + DoneSeqMember !EndGI !MarkRep
T ? DoneSeqMember !EndGI !MarkRep
T ) PopCP() !EndGI
T , SeqNeedNext !EndGI

# after () child in a seq
State AfterSeqMember BustedMarkup {after )}
T $S DoneSeqMember
T , SeqNeedNext
T * DoneSeqMember !MarkRep
T ? DoneSeqMember !MarkRep
T + DoneSeqMember !MarkRep
T ) PopCP()

# all done with member of a seq
State DoneSeqMember BustedMarkup {after member of sequence}
T % Push(PERef,DoneSeqMember) !MarkPC !ColdStart
T $S DoneSeqMember
T , SeqNeedNext
T ) PopCP()

# after connector in a choice
State ChoiceNeedNext BustedMarkup {after |}
T % Push(PERef,ChoiceNeedNext) !MarkPC !ColdStart
T $S ChoiceNeedNext
T $NameStart ChoiceType !HotStart
T ( Push(CPStart,AfterChoiceMember) !StartCP

# child type in a choice
State ChoiceType BustedMarkup {}
T $NameC ChoiceType
T $S DoneChoiceMember !EndGI
T * DoneChoiceMember !EndGI !MarkRep
T + DoneChoiceMember !EndGI !MarkRep
T ? DoneChoiceMember !EndGI !MarkRep
T ) PopCP() !EndGI
T | ChoiceNeedNext !EndGI

# after () child in a Choice
State AfterChoiceMember BustedMarkup {after )}
T $S DoneChoiceMember
T | ChoiceNeedNext
T * DoneChoiceMember !MarkRep
T + DoneChoiceMember !MarkRep
T ? DoneChoiceMember !MarkRep
T ) PopCP()

# all done with member of a choice
State DoneChoiceMember BustedMarkup {after member of choice}
T % Push(PERef,DoneChoiceMember) !MarkPC !ColdStart
T $S DoneChoiceMember
T | ChoiceNeedNext
T ) PopCP()

# Saw (#PCDATA
State AfterPCDATA BustedDecl {after #PCDATA}
T % Push(PERef,AfterPCDATA) !MarkPC !ColdStart
T $S AfterPCDATA
T ) SimpleMixed
T | MixedNeedGI

# Saw (#PCDATA|
State MixedNeedGI BustedDecl {after (#PCDATA|}
T $S MixedNeedGI
T % Push(PERef,MixedNeedGI) !MarkPC !ColdStart
T $NameStart MixedGI !HotStart

State MixedGI BustedDecl {in element type}
T $NameC MixedGI
T $S MixedAfterGI !EndGI
T | MixedNeedGI !EndGI
T ) AfterMixed !EndGI

State MixedAfterGI BustedDecl {after element type}
T $S MixedAfterGI
T % Push(PERef,MixedAfterGI) !MarkPC !ColdStart
T | MixedNeedGI
T ) AfterMixed

State AfterMixed BustedDecl {after mixed declaration}
T * DoneMixed !DoMixed

State DoneMixed BustedDecl {}
T % Push(PERef,DoneMixed) !MarkPC !ColdStart
T $S DoneMixed
T > InSubset

# Saw (#PCDATA)
State SimpleMixed BustedDecl {after (#PCDATA)}
T * DoneMixed !DoMixed
T $S DoneMixed !DoMixed
T > InSubset !DoMixed

State AfterEntityKW BustedDecl {after InSubset !ReportInternalEntity
T $S AfterEVal

# InSubset !ReportSystemTextEntity

State DoneEntExternalID BustedDecl {}
T % Push(PERef,DoneEntExternalID) !MarkPC !ColdStart
T N Keyword(NDATA,AfterNDATA,1)
T > InSubset !ReportSystemTextEntity
T $S DoneEntExternalID

# NDATA
State AfterNDATA BustedDecl {after NDATA}
T $S SawNDATA

# Saw NDATA
State SawNDATA BustedDecl {}
T % Push(PERef,SawNDATA) !MarkPC !ColdStart
T $NameStart InNotationName !HotStart
T $S SawNDATA

# Reading notation name after NDATA
State InNotationName BustedDecl {}
T $NameC InNotationName
T $S AfterNDATADecl
T > InSubset !EndSave !DeclareUnparsed

State AfterNDATADecl BustedDecl {after NDATA declaration}
T $S AfterNDATADecl
T > InSubset !EndSave !DeclareUnparsed

# saw InDoc !EndGI !ReportETag
T $NameC EtagGI
T $S SawEtagGI !EndGI

#10 S after GI in end-tag
State SawEtagGI BustedMarkup {}
T > InDoc !ReportETag
T $S SawEtagGI

# in Start tag GI
State StagGI BustedMarkup {in element type}
T $NameC StagGI
T $S InStag !EndGI
T > InDoc !EndGI !ReportSTag
T / EmptyClose !EndGI

State EmptyClose BustedMarkup {after / in start-tag}
T > InDoc !ReportEmpty

# in tag after gi
State InStag BustedMarkup {in start-tag}
T > InDoc !ReportSTag
T / EmptyClose
T $S InStag
T $NameStart AttrName !HotStart

# Attr name
State AttrName BustedMarkup {}
T = Eq2 !EndAttribute
T $NameC AttrName
T $S Eq1 !EndAttribute

# ' ' before =
State Eq1 BustedMarkup {}
T = Eq2
T $S Eq1

# = in Eq
State Eq2 BustedMarkup {after AttrName= in start-tag}
T ' Push(SQAVal,InStag) !ColdStart !EnterAttrVal
T " Push(DQAVal,InStag) !ColdStart !EnterAttrVal
T $S Eq2

# in AttrValue, '-delimited
State SQAVal BustedMarkup {}
T < SQAVal !MarkLt
T ' Pop() !EndAttrVal
T & Push(SawAmp,SQAVal) !ReportText !MarkAmp !ColdStart
T $. SQAVal

# in Default AttrValue, '-delimited
State SQADef BustedDecl {}
T ' Pop() !EndAttrVal !ReportAttDecl
T & Push(SawAmp,SQAVal) !ReportText !MarkAmp !ColdStart
T $. SQADef

# in AttrValue, "-delimited
State DQAVal BustedMarkup {}
T < DQAVal !MarkLt
T " Pop() !EndAttrVal
T & Push(SawAmp,DQAVal) !ReportText !MarkAmp !ColdStart
T $. DQAVal

# in Default Attrval
State DQADef BustedDecl {}
T " Pop() !EndAttrVal !ReportAttDecl
T & Push(SawAmp,DQAVal) !ReportText !MarkAmp !ColdStart
T $. DQADef

# scanning in PCData
State InDoc InDoc {in character data}
T < SawLT !ReportText !MarkLt
T & Push(SawAmp,InDoc) !ReportText !MarkAmp !ColdStart
T $. InDoc
T ] RSBInDoc

# ] in doc
State RSBInDoc InDoc {}
T < SawLT !ReportText !MarkLt
T & Push(SawAmp,InDoc) !ReportText !MarkAmp !ColdStart
T ] RSB2InDoc
T $. InDoc

# ]] in doc
State RSB2InDoc InDoc {}
T < SawLT !ReportText !MarkLt
T & Push(SawAmp,InDoc) !ReportText !MarkAmp !ColdStart
T > InDoc !FloatingMSE
T ] RSB2InDoc
T $. InDoc

# After root element
State AfterRoot AfterRoot {after end of document}
T $S AfterRoot
T < LTAfterRoot

State LTAfterRoot BustedMarkup {< after document}
T ? Push(InPI,AfterRoot) !ColdStart
T ! ComStartAfter

State ComStartAfter BustedMarkup {}
T - Push(COMStartHalf,AfterRoot)

State SawAmp BustedEntity {after &}
T a AmpA
T g AmpG
T l AmpL
T q AmpQ
T # AmpHash
T $NameStart EntBody

# reading numeric char refs
State AmpHash BustedEntity {in &# reference}
T 0123456789 AmpHash
T x HexRef
T ; Pop() !HashRef

# hex char refs
State HexRef BustedEntity {in &#x reference}
T $Hex HexRef
T ; Pop() !HashRef

State BustedEntity InDoc {}
T ; Pop() !ColdStart
T $. BustedEntity

State EntBody BustedEntity {in entity name}
T ; Pop() !ReportReference
T $NameC EntBody

State AmpG BustedEntity {in entity reference}
T t AmpGt
T $NameC EntBody

State AmpGt BustedEntity {}
T ; Pop() !CharRef(>)
T $NameC EntBody

State AmpL BustedEntity {}
T t AmpLt
T $NameC EntBody

State AmpLt BustedEntity {}
T ; Pop() !CharRef(<)
T $NameC EntBody

State AmpQ BustedEntity {}
T u AmpQu
T $NameC EntBody

State AmpQu BustedEntity {}
T o AmpQuo
T $NameC EntBody

State AmpQuo BustedEntity {}
T t AmpQuot
T $NameC EntBody

State AmpQuot BustedEntity {}
T ; Pop() !CharRef(")
T $NameC EntBody

State AmpA BustedEntity {}
T m AmpAm
T p AmpAp
T $NameC EntBody

State AmpAp BustedEntity {}
T o AmpApo
T $NameC EntBody

State AmpApo BustedEntity {}
T s AmpApos
T $NameC EntBody

State AmpApos BustedEntity {}
T ; Pop() !CharRef(')
T $NameC EntBody

State AmpAm BustedEntity {}
T p AmpAmp
T $NameC EntBody

State AmpAmp BustedEntity {}
T ; Pop() !CharRef(&)
T $NameC EntBody

# Breakage in markup - wait for ">"
State BustedMarkup InDoc {}
T > InDoc !ColdStart
T $. BustedMarkup

# Breakage in declaration in subset - wait for '>'
State BustedDecl InDoc {}
T > InSubset
T $. BustedDecl

# XML declaration is:
# XMLDecl ::= ''
# TextDecl ::= ''
# VersionInfo ::= S 'version' Eq ('"VersionNum"' | "'VersionNum'")
# Eq ::= S? '=' S?
# VersionNum ::= ([a-zA-Z0-9_.:] | '-')+
# EncodingDecl ::= S 'encoding' Eq '"' EncName '"' | "'" EncName "'"
# EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
# SDDecl ::= S 'standalone' Eq "'" ('yes' | 'no') "'"
# | S 'standalone' Eq '"' ('yes' | 'no') '"'
# we start after the Pop() !GotXMLD
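
# The header comment describes each entry as "State Statename Fallbackstate
# {errormessage}", followed by "T" transition lines. A minimal sketch of how
# such a table might be loaded is given below. It is not the processor's own
# loader: the names load_automaton, State and Transition are hypothetical, and
# it assumes one State or T entry per line, with "T <chars> <target>" followed
# by optional "!action" tokens.

```python
import re
from dataclasses import dataclass, field

# "State <name> <fallback> {<error message>}"; the message may be empty.
STATE_RE = re.compile(r"^State\s+(\S+)\s+(\S+)\s+\{(.*)\}\s*$")


@dataclass
class Transition:
    chars: str      # literal characters or a class such as $S, $NameC, $.
    target: str     # next state, or Push(...)/Pop()/Keyword(...) as written
    actions: list   # action names with the leading "!" removed


@dataclass
class State:
    name: str
    fallback: str
    message: str
    transitions: list = field(default_factory=list)


def load_automaton(text):
    """Parse State/T lines into a dict of State objects keyed by name."""
    states = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Whole-line comments start with "#"; "T # AmpHash" starts with "T".
        if not line or line.startswith("#"):
            continue
        match = STATE_RE.match(line)
        if match:
            current = State(match.group(1), match.group(2), match.group(3))
            states[current.name] = current
            continue
        parts = line.split()
        if parts[0] == "T" and len(parts) >= 3 and current is not None:
            actions = [p[1:] for p in parts[3:] if p.startswith("!")]
            current.transitions.append(Transition(parts[1], parts[2], actions))
    return states


SAMPLE = """\
# Start
State Start InProlog {in prolog}
T < LTatStart !MarkLt
T $S InProlog

State SawAmp BustedEntity {after &}
T # AmpHash
T $NameStart EntBody
"""

automaton = load_automaton(SAMPLE)
```

# Targets such as Push(InPI,InProlog) and Pop() are kept as plain strings in
# this sketch; interpreting them is left to whatever drives the automaton.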