APPENDIX A. MLP TRAINING OUTPUT

Explanation of the output produced during MLP training

When the program mlp does a training run, it writes output to the standard error and writes the same output to the short_outfile specified in the specfile. The purpose of this appendix is to explain the meaning of this output. Mlp produces similar output for a testing run, except that the "training progress" part is missing.

Pattern-Weights

As a preliminary, it will be helpful to discuss the "pattern-weights" which mlp uses, since they are used in the calculations of many of the values shown in the output. The pattern-weights are "prior" weights, one for each pattern;12 they remain constant during a training (or testing) run, although it is possible to do a training "meta-run" that is a sequence of training runs and to change the pattern-weights between the runs. The setting of the pattern-weights is controlled by the priors value set in the specfile and may be affected by provided data files, as follows (in all cases, the division by N is merely a normalization that slightly reduces the amount of calculation needed later):

allsame: If priors is allsame, then each pattern-weight is set to 1/N, where N is the number of patterns.

class: A file of given class-weights must be supplied; each given class-weight is divided by the actual class-weight of the input data set, and the new class-weights are normalized so their sum is 1.0. Then each pattern-weight is set to the new class-weight of the class of the corresponding pattern, divided by N (the number of patterns). The end result is that if the actual distribution of the data set does not equal that of the given class-weights, the class-weights are adjusted so the final results approximate what the scores would be if the distribution were the same as the given class-weights. If the user is only concerned about the unadjusted score for the given data, set the given class-weights equal to the actual class-weights.

pattern: A file of (original) pattern-weights must be supplied; each of them is divided by N to produce the corresponding pattern-weight.
both: Files of class-weights and (original) pattern-weights must both be supplied; each pattern-weight is then set to the class-weight (the class-weights are adjusted as discussed in the class portion of this list) of the class of the corresponding pattern, times the corresponding (original) pattern-weight, divided by N.

The pattern-weights are used in the calculation of the error value that mlp attempts to minimize during training. When the training patterns are sent through the network, each pattern produces an error contribution, which is multiplied by the pattern-weight for that pattern before being added to an error accumulator (Section A.1.1.2.2). The pattern-weights are also involved in the calculations of several other quantities besides the error value; all these uses are described below. Reference [49] discusses the use of class-based prior weights (Section 5.4, pages 10-11), which correspond to the class setting of priors.

12 A pattern is a feature-vector/class or feature-vector/target-vector pair.

Explanation of Output

A.1.1.1 Header

The first part of the output is a "header" showing the specfile parameter values. Here is the header of the short_outfile test/pcasys/execs/mlp/mlp_dir/trn1.err produced by the first training run of a sequence of runs used to train the fingerprint classifier:

 Classifier MLP
 Training run
 Patterns file: fv1-9mlp.kls; using all 24300 patterns
 Final pattern-wts: made from provided class-wts and pattern-wts, files priors and patwts
 Error function: sum of squares
 Reg. factor: 2.000e+00
 Activation fns. on hidden, output nodes: sinusoid, sinusoid
 Nos. of input, hidden, output nodes: 128, 128, 6
 Boltzmann pruning, thresh. exp(-w^2/T), T 1.000e-05
 Will use SCG
 Initial network weights: random, seed 12347
 Final network weights will be written as file trn1.wts
 Stopping criteria (max. no.
 of iterations 50):
   (RMS err) <= 0.000e+00 OR
   (RMS g) <= 0.000e+00 * (RMS w) OR
   (RMS err) > 9.900e-01 * (RMS err 10 iters ago) OR
   (OK - NG count) < (count 10 iters ago) + 1. (OK level: 0.000)
 Long outfile: trn1l.err
 Given and Actual Prior Weights
  A => 0.036583  0.038025
  L => 0.338497  0.319506
  R => 0.316920  0.306584
  S => 0.000000  0.005597
  T => 0.029482  0.030123
  W => 0.278518  0.300165
 Given/Actual = New Prior Weights
  A -> 0.193897
  L -> 0.213518
  R -> 0.208333
  S -> 0.000000
  T -> 0.197247
  W -> 0.187005
 SCG: doing <= 50 iterations; 17286 variables.

A.1.1.2 Training Progress

The next part of the output is a running update on the training progress. The first few lines of training progress reported are:

 pruned 80 6 86 C 1.67872e+05 H 2.40068e+04 R 85.70 M -0.00 T 0.0841
  Iter    Err  (   Ep    Ew)    OK  UNK    NG     OK  UNK   NG
     0  0.474  (0.240 0.289)  6564    0 17736 = 27.0  0.0 73.0 %
   0.0    0   4  19   0   0  70
 pruned 108 3 111 C 1.75555e+05 H 2.54052e+04 R 85.53 M -0.00 T 0.0836
 pruned 124 5 129 C 1.84026e+05 H 2.58204e+04 R 85.97 M -0.00 T 0.0824
 pruned 129 6 135 C 2.20275e+05 H 2.72642e+04 R 87.62 M -0.00 T 0.0814
 pruned 138 3 141 C 1.73226e+05 H 2.76075e+04 R 84.06 M -0.00 T 0.0803
 pruned 138 5 143 C 1.78328e+05 H 2.99593e+04 R 83.20 M -0.00 T 0.0762
 pruned 152 4 156 C 1.74579e+05 H 3.03576e+04 R 82.61 M -0.00 T 0.0745
 pruned 167 5 172 C 1.81337e+05 H 3.14710e+04 R 82.65 M -0.00 T 0.0681
 pruned 149 7 156 C 1.89832e+05 H 3.95510e+04 R 79.17 M -0.00 T 0.0536
 pruned 178 7 185 C 1.78410e+05 H 3.90489e+04 R 78.11 M -0.00 T 0.0526
 pruned 184 7 191 C 2.19716e+05 H 3.99658e+04 R 81.81 M -0.00 T 0.0490
    10  0.328  (0.103 0.220) 19634    0  4666 = 80.8  0.0 19.2 %
   0.0    2  90  99   0   1  68

The line

  Iter    Err  (   Ep    Ew)    OK  UNK    NG     OK  UNK   NG

comprises column headers that pertain to those subsequent lines that begin with an integer ("first progress lines"); each first progress line is followed by a "second progress line," and there are "pruning lines" if Boltzmann pruning is used. These three types of lines are discussed below, second progress lines first, because
some of the calculations used to produce them are later used to make the first progress lines.

A.1.1.2.1 Second progress lines

These are the lines that begin with fractional numbers; the first of them in the above example is

   0.0    0   4  19   0   0  70

Ignoring for a moment the first value in such a line, the remaining values are the "percentages" right by class, which mlp calculates as follows. It maintains three pattern-weight accumulators for each class:

 a_i^(r) = right pattern-weight accumulator for correct class i
 a_i^(w) = wrong pattern-weight accumulator for correct class i
 a_i^(u) = unknown (rejected) pattern-weight accumulator for correct class i

When mlp sends a training pattern through the network, the result is an output activation for each class; the hypothetical class is, of course, whichever class receives the highest activation. If the highest activation equals or exceeds the rejection threshold oklvl set in the specfile, then mlp accepts its result for this pattern, and adds its pattern-weight (Section A.1.1) to either a_i^(r) or a_i^(w) (where i is the correct class of the pattern), according to whether the network classified the pattern rightly or wrongly. Otherwise (i.e. if the highest activation is less than oklvl), mlp adds the pattern-weight to a_i^(u). These accumulators reach their final values after all of the training patterns are sent through the network. Mlp then defines the right "percentage" of correct class i to be

 100 a_i^(r) / (a_i^(r) + a_i^(w) + a_i^(u))

It shows these values, rounded to integers, in the second progress lines, as the values after the first one. For example, the second progress line above shows that the right "percentages" of correct classes 0 and 1 are 0 and 4.13 If priors is allsame then the pattern-weights are all equal, and so a_i^(r), etc. are the numbers classified rightly, etc., times this single pattern-weight; the pattern-weight cancels out between the numerator and denominator of the above formula, so that the resulting value really is the percentage of the patterns of class i that the network classified rightly. If priors has a value other than allsame (i.e.
class, pattern, or both), then the right "percentages" of the classes are not the simple percentages but rather are weighted quantities, which may make more sense than the simple percentages if some patterns should have more impact than others, as indicated by their larger weights.14 As for the first value of a second progress line, this is merely the minimum of the right "percentages" of the classes, but shown rounded to the nearest tenth rather than to the nearest integer. This minimum value shows how the network is doing on its "worst" class.15

A.1.1.2.2 First progress lines

These are the lines that begin with an integer. The column headings, which pertain to these lines, and the first of these lines in the example, are:

  Iter    Err  (   Ep    Ew)    OK  UNK    NG     OK  UNK   NG
     0  0.474  (0.240 0.289)  6564    0 17736 = 27.0  0.0 73.0 %

The values in a first progress line have the following meanings:

Iter: Training iteration number, numbering starting at 0. A first progress line (and second progress line) are produced every nfreq'th iteration (nfreq is set in the specfile).

Err, Ep, Ew: The calculations leading to these values are as follows.

13 In this case the classes' index numbers are 0 through 5, and the classes are fingerprint types Arch (A), Left Loop (L), Right Loop (R), Tented Arch (T), Scar (S), and Whorl (W). In this discussion, "class i" merely means the class whose index number, numbering starting at 0, is i. Note also that although the software uses class index numbers that start at 0, the class index numbers it writes to long_outfile start at 1.

14 In particular, if the training pattern set is such that the proportions of the patterns belonging to the various classes are not approximately equal to the natural frequencies of the classes, then it may be a good idea to use class-weights (priors set to class, and class-weights provided in a file) to compensate for the erroneous distribution. See [49].

15 When mlp uses hybrid SCG/LBFGS training rather than only SCG (it does this only if pruning is not specified), it switches from SCG to LBFGS when the minimum reaches or exceeds a specified threshold, scg_earlystop_pct.
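The per-class accumulator bookkeeping of Section A.1.1.2.1 can be sketched as follows. This is a minimal Python illustration, not the NIST C implementation; the function name right_percentages, its argument layout, and the name ok_lvl are invented for this sketch.

```python
from collections import defaultdict

def right_percentages(patterns, ok_lvl):
    """patterns: iterable of (correct_class, activations, pattern_weight).

    Returns ({class: weighted right "percentage"}, minimum percentage),
    mirroring the values of a second progress line (before rounding)."""
    acc_r = defaultdict(float)  # a_i^(r): right pattern-weight accumulator
    acc_w = defaultdict(float)  # a_i^(w): wrong pattern-weight accumulator
    acc_u = defaultdict(float)  # a_i^(u): unknown (rejected) accumulator
    for correct, acts, wt in patterns:
        hyp = max(range(len(acts)), key=acts.__getitem__)  # highest activation
        if acts[hyp] >= ok_lvl:                 # result accepted
            (acc_r if hyp == correct else acc_w)[correct] += wt
        else:                                   # result rejected (unknown)
            acc_u[correct] += wt
    classes = set(acc_r) | set(acc_w) | set(acc_u)
    pct = {i: 100.0 * acc_r[i] / (acc_r[i] + acc_w[i] + acc_u[i])
           for i in classes}
    return pct, min(pct.values())
```

With equal pattern-weights (the allsame case), the weight cancels and the returned values are the plain per-class percentages right.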
 N                   = number of patterns
 n                   = number of classes
 a_ij                = activation produced by pattern i at output node j (i.e. class j)
 t_ij                = target value for a_ij
 w_i^(pat)           = pattern-weight of pattern i (Section A.1.1)
 E_(pat i)^(mse)     = sum_{j=0..n-1} (a_ij - t_ij)^2,
                       the error contribution of pattern i if errfunc is mse
 E^(mse)             = (1/(2n)) sum_{i=0..N-1} w_i^(pat) E_(pat i)^(mse)
 E_(pat i)^(type_1)  = sum_{j != k} 1 / (1 + exp(alpha (a_ik - a_ij))), where k is the
                       correct class of pattern i and alpha is set in the specfile;
                       the error contribution of pattern i if errfunc is type_1
 E^(type_1)          = (1/n) sum_{i=0..N-1} w_i^(pat) E_(pat i)^(type_1)
 E_(pat i)^(pos_sum) = error contribution of pattern i if errfunc is pos_sum
 E^(pos_sum)         = (1/n) sum_{i=0..N-1} w_i^(pat) E_(pat i)^(pos_sum)
 E_1                 = E^(mse), E^(type_1), or E^(pos_sum), according to errfunc
 Ep                  = E_1 if errfunc is pos_sum, sqrt(2 E_1) otherwise
 s^(wsq)             = half of the mean squared network weight
 Ew                  = sqrt(2 s^(wsq))
 E                   = E_1 + regfac * s^(wsq)
 Err                 = sqrt(2 E)

Mlp prints the Err, Ep and Ew values as defined above. Note that the value mlp attempts to minimize is E, but presumably the same effect would be had by attempting to minimize Err, since it is an increasing function of E.

OK, UNK, NG, OK, UNK, NG: "Numbers" of patterns OK (classified correctly), UNKnown (rejected), and wroNG or No Good (classified incorrectly), then the corresponding "percentages." Mlp calculates these values as follows. It adds up the by-class accumulators a_i^(r), a_i^(w), and a_i^(u) defined earlier to make overall accumulators, where n is the number of classes:

 a^(r) = sum_{i=0..n-1} a_i^(r)
 a^(w) = sum_{i=0..n-1} a_i^(w)
 a^(u) = sum_{i=0..n-1} a_i^(u)

It computes "numbers" right, wrong, and unknown -- the first OK, NG, and UNK values of a first progress line -- as follows, where N is the number of patterns and square brackets denote rounding to an integer:

 a^(rwu) = a^(r) + a^(w) + a^(u)
 n^(r)   = [N a^(r) / a^(rwu)] = "number" right
 n^(w)   = [N a^(w) / a^(rwu)] = "number" wrong
 n^(u)   = N - n^(r) - n^(w)   = "number" unknown

From these "numbers," mlp computes the corresponding "percentages" -- the second OK, NG, and UNK values -- as follows:

 p^(r) = [100 n^(r) / N]
 p^(w) = [100 n^(w) / N]
 p^(u) = [100 n^(u) / N]

If priors is allsame then, since the pattern-weights are all equal, cancellation of the single pattern-weight occurs between the numerators and denominators of the formulas above for n^(r) and n^(w), so that they really are the numbers of patterns classified rightly and wrongly. Then it is obvious that n^(u) really is the number unknown and that p^(r), etc. really are the percentages classified rightly, etc.

A.1.1.2.3 Pruning lines (optional)

These lines, which begin with "pruned," appear if Boltzmann pruning is specified (boltzmann set to abs_prune or square_prune in the specfile, and a temperature set). The first pruning line of the example is

 pruned 80 6 86 C 1.67872e+05 H 2.40068e+04 R 85.70 M -0.00 T 0.0841

Regardless of nfreq, mlp writes a pruning line every time it performs pruning. The first three values of a pruning line are the numbers of network weights that mlp pruned (temporarily set to zero) in the first weights layer, in the second layer, and in both layers together. The remaining values, announced by the letters C, H, R, and M, are calculated as follows (the value announced by T actually is not calculated correctly, and should be ignored):

 n^(wts)      = number of network weights (both layers)
 n^(pruned)   = number of weights pruned
 n^(unpruned) = n^(wts) - n^(pruned)
 w_max, w_min = maximum and minimum absolute values of unpruned weights
 C            = n^(unpruned) (log w_max - log w_min + log 2) / log 2 = capacity
 s^(abslog)   = sum of logarithms of absolute values of unpruned weights
 H            = (s^(abslog) - n^(unpruned) (log w_min - log 2)) / log 2 = entropy
 R            = 100 (C - H) / C
 M            = mean of unpruned weights

A.1.1.3 Confusion Matrices and Miscellaneous Information (Optional)

If do_confuse is set to true in the specfile, the next part of the output consists of two "confusion matrices" and some miscellaneous information:

 oklvl 0.00
 # Highest two outputs (mean) 0.784 0.145; mean diff 0.639
 key name
  A   A
  L   L
  R   R
  S   S
  T   T
  W   W
 # key:   A    L    R    S    T    W
 # row: correct, column: actual
 # A:   333  315  267    0    0    9
 # L:    12 7522   86    0    0  144
 # R:    21  148 7128    0    0  153
 # S:     0    0    0    0    0    0
 # T:    60  346  323    0    0    3
 # W:     2  798  509    0    0 5985
 # unknown
 # *      0    0    0    0    0    0
 percent of true IDs correctly identified (rows)       36  97  96   0   0  82
 percent of predicted IDs correctly identified (cols)  78  82  86   0   0  95
 # mean highest activation level
 # row: correct, column: actual
 # key:   A    L    R    S    T    W
 # A:    35   43   43    0    0   38
 # L:    32   83   41    0    0   48
 # R:    32   43   83    0    0   49
 # S:    88 4666 4042    0    0  317
 # T:    33   49   48    0    0   38
 # W:    29   61   58    0    0   85
 # unknown
 # *      0    0    0    0    0    0
 Histogram of errors, from 2^(-10) to 1
 15899 5322 10477 14278 15596 22398 16728 16005 13376 9364 6357
  10.9  3.7   7.2   9.8  10.7  15.4  11.5  11.0   9.2  6.4  4.4%

The first line of this optional section of the output shows the value of the rejection threshold oklvl set in the specfile (this was already shown in the header). The next line shows the mean values, over the training patterns as sent through the network at the end of training, of the highest and second-highest output node values, and the mean difference of these values. Next is a table showing the short class name ("key") and long class name ("name") of each class. In this example the keys and names are the same, but in general the names can be quite long, whereas the keys must be no longer than two characters; the short keys are used to label the confusion matrices.

Next are the confusion matrices of "numbers" and of "mean highest activation level." Mlp has the following accumulators:

 a_ij^(patwts)  = pattern-weight accumulator for correct class i and hypothetical class j
 a_ij^(highac)  = high-activation accumulator for correct class i and hypothetical class j
 a_i^(highac,u) = high-activation unknown accumulator for correct class i

If a pattern sent through the network produces a highest activation that meets or exceeds oklvl (so that mlp accepts its result for this pattern), then mlp adds its pattern-weight to a_ij^(patwts) and adds the highest activation to a_ij^(highac), where i and j are the correct class and hypothetical class of the pattern.
Otherwise, i.e. if mlp finds the pattern to be unknown (rejects the result), it adds its pattern-weight to a_i^(u) (Section A.1.1.2.1) and adds the highest activation to a_i^(highac,u), where i is the correct class of the pattern. After it has processed all the patterns, mlp calculates the confusion matrix of "numbers" and its "unknown" line; some additional information concerning the rows and columns of that matrix; and the confusion matrix of "mean highest activation level" and its "unknown" line, as follows. First define some notation:

 N_i^(pats)      = number of patterns of correct class i
 n_ij^(confuse)  = value in row i and column j of the first confusion matrix (of "numbers")
 n_i^(confuse,u) = ith value of the "unknown" line at the bottom of the first confusion matrix
 p_i^(r,row)     = ith value of the "percent of true IDs correctly identified (rows)" line
 p_j^(r,col)     = jth value of the "percent of predicted IDs correctly identified (cols)" line
 h_ij^(confuse)  = value in row i and column j of the second confusion matrix
 h_i^(confuse,u) = ith value of the "unknown" line at the bottom of the second confusion matrix

Mlp calculates these values as follows, where a_i^(r), a_i^(w), and a_i^(u) are as defined in Section A.1.1.2.1 and square brackets again denote rounding to an integer:16

 n_ij^(confuse)  = [N_i^(pats) a_ij^(patwts) / (a_i^(u) + sum_{j=0..n-1} a_ij^(patwts))]
 n_i^(confuse,u) = [N_i^(pats) a_i^(u) / (a_i^(r) + a_i^(w) + a_i^(u))]
 p_i^(r,row)     = [100 n_ii^(confuse) / (N_i^(pats) - n_i^(confuse,u))]
 p_j^(r,col)     = [100 n_jj^(confuse) / sum_{i=0..n-1} n_ij^(confuse)]
 h_ij^(confuse)  = [100 a_ij^(highac) / n_ij^(confuse)]
 h_i^(confuse,u) = [100 a_i^(highac,u) / n_i^(confuse,u)]

If priors is allsame, the pattern-weights are all equal, and cancellation of the single pattern-weight between numerator and denominator causes n_ij^(confuse) above to be the number of patterns of correct class i and hypothetical class j; similarly, n_i^(confuse,u) really is the number of patterns of correct class i that were unknown; p_i^(r,row) and p_j^(r,col) really are the percentages that the on-diagonal (correctly classified) numbers in the matrix comprise of their rows and columns respectively; h_ij^(confuse) really is the mean highest activation level (multiplied by 100 and rounded to an integer) of the patterns of correct class i and hypothetical class j; and h_i^(confuse,u) really is the mean highest activation level of the patterns of correct class i that were unknown. If priors has one of its other values, the printed values are weighted versions of these quantities.

The final part of this optional section of the output is a histogram of errors. This pertains to the absolute errors between output activations and target activations, across all output nodes (6 nodes in this example) and all training patterns (24,300 patterns in this example), when the patterns are sent through the trained network. Of the resulting set of absolute error values (145,800 values in this example), this histogram shows the number (first line) and percentage (second line) of these values that fall into each of the 11 intervals (-inf, 2^-10], (2^-10, 2^-9], ..., (2^-1, 1].

16 The denominators of the expressions shown here for n_ij^(confuse) and n_i^(confuse,u) are equal, but these expressions show what the software actually calculates.
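In the allsame case, where the single pattern-weight cancels, the first confusion matrix and its "unknown" line reduce to plain pattern counts. The following is a minimal sketch of that case only (not the NIST C implementation; the function name confusion_counts and its argument layout are invented):

```python
def confusion_counts(patterns, n_classes, ok_lvl):
    """patterns: iterable of (correct_class, activations).

    Returns (matrix, unknown): matrix[i][j] counts accepted patterns of
    correct class i and hypothetical class j; unknown[i] counts
    rejected patterns of correct class i (the "unknown" line)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    unknown = [0] * n_classes
    for correct, acts in patterns:
        hyp = max(range(n_classes), key=acts.__getitem__)  # highest activation
        if acts[hyp] >= ok_lvl:       # accepted: tally in row correct, column hyp
            matrix[correct][hyp] += 1
        else:                         # rejected: tally in the "unknown" line
            unknown[correct] += 1
    return matrix, unknown
```

From these counts, p_i^(r,row) follows as 100 * matrix[i][i] / (N_i^(pats) - unknown[i]), and p_j^(r,col) as 100 * matrix[j][j] divided by the column sum.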
A.1.1.4 Final Progress Lines

The next part of the output consists of a repeat of the column-headers line, final first-progress-line, and final second-progress-line of the training progress part of the output, but with an F prepended to the final first-progress-line:

  Iter    Err  (   Ep    Ew)    OK  UNK    NG     OK  UNK   NG
 F  50  0.098  (0.081 0.040) 21211    0  3089 = 87.3  0.0 12.7 %
   0.0   36  97  96   0   0  82

A.1.1.5 Correct-vs.-Rejected Table (Optional)

If do_cvr is set to true in the specfile, the next part of the output is a correct-vs.-rejected table; the first and last few lines of this table, from the example output, are:

       thresh    right  unknown    wrong  correct rejected
  1tr 0.000000   21211        0     3089    87.29     0.00
  2tr 0.050000   21211        0     3089    87.29     0.00
  3tr 0.100000   21211        0     3089    87.29     0.00
  4tr 0.150000   21211        0     3089    87.29     0.00
  5tr 0.200000   21211        0     3089    87.29     0.00
 ...
 48tr 0.975000    3777    20521        2    99.95    84.45
 49tr 0.980000    3230    21068        2    99.94    86.70
 50tr 0.985000    2691    21607        2    99.93    88.92
 51tr 0.990000    2141    22158        1    99.95    91.19
 52tr 0.995000    1509    22791        0   100.00    93.79

Mlp produces these table values as follows. It has a fixed array of rejection-threshold values, which have been set in an unequally-spaced pattern that works well, and it uses three pattern-weight accumulators for each threshold. As mlp sends each pattern through the finished network,17 it loops over the thresholds t_k: for each k, it compares the highest network activation produced for the pattern with t_k to decide whether the pattern would be accepted or rejected.
If accepted, it adds the pattern-weight of that pattern either to a_k^(cvr,r) or to a_k^(cvr,w), according to whether it classified the pattern rightly or wrongly; if rejected, it adds the pattern-weight to a_k^(cvr,u). Here

 t_k         = kth threshold
 a_k^(cvr,r) = right pattern-weight accumulator for the kth threshold
 a_k^(cvr,w) = wrong pattern-weight accumulator for the kth threshold
 a_k^(cvr,u) = unknown pattern-weight accumulator for the kth threshold

After all the patterns have been through the network, mlp finishes the table as follows. For each threshold t_k it calculates the following values, where square brackets again denote rounding to an integer:

 a_k^(cvr,rwu) = a_k^(cvr,r) + a_k^(cvr,w) + a_k^(cvr,u)
 n^(cvr,r)     = [N a_k^(cvr,r) / a_k^(cvr,rwu)] = "number right"
 n^(cvr,w)     = [N a_k^(cvr,w) / a_k^(cvr,rwu)] = "number wrong"
 n^(cvr,u)     = N - n^(cvr,r) - n^(cvr,w)       = "number unknown" (rejected)
 p^(cvr,corr)  = 100 n^(cvr,r) / (n^(cvr,r) + n^(cvr,w)) = "percentage correct"
 p^(cvr,rej)   = 100 n^(cvr,u) / N                        = "percentage rejected"

Mlp then writes a line of the table. The values of the line are the threshold index k plus 1 with "tr"18 appended, t_k ("thresh"), n^(cvr,r) ("right"), n^(cvr,u) ("unknown"), n^(cvr,w) ("wrong"), p^(cvr,corr) ("correct"), and p^(cvr,rej) ("rejected"). If priors is allsame then, since all pattern-weights are the same, cancellation of the single pattern-weight occurs between numerator and denominator in the above expressions for n^(cvr,r) and n^(cvr,w), so they really are the numbers of patterns classified rightly and wrongly if threshold t_k is used. Also, it is obvious that n^(cvr,u) really is the number of patterns unknown for this threshold, p^(cvr,corr) really is the percentage of the patterns accepted at this threshold that were classified correctly, and p^(cvr,rej) really is the percentage of the N patterns that were rejected at this threshold. If priors has one of its other values, then the tabulated values are weighted versions of these quantities.

17 If do_cvr is true then mlp calculates a correct vs. reject table, but only for the final state of the network in the training run.

18 For "training"; the correct vs. reject table for a test run uses "ts".

A.1.1.6 Final Information

The final part of the output shows miscellaneous information:

 Iter 50; ierr 1 : iteration limit
 Used 51 iterations; 154 function calls; Err 0.098; |g|/|w| 1.603e-04
 Rms change in weights 0.289
 User+system time used: 3607.3 (s) 1:00:07.3 (h:m:s)
 Wrote weights as file trn1.wts

The first line here shows what iteration the training run ended on, and the value and meaning of the return code ierr, which indicates why mlp stopped its training run: in the example, the specified maximum number of iterations (niter_max), 50, had been used. This training run was actually the first run of a sequence; its initial network weights were random, but each subsequent run used the final weights of the preceding run as its initial weights. The only parameter varied from one run to the next was the regularization factor regfac, which was decreased at each step: successive regularization. Each run was limited to 50 iterations, and it was assumed that this small iteration limit would be reached before any of the other stopping conditions were satisfied. When sinusoid activation functions are used, as in this case, best training requires that successive regularization be used. If sigmoid functions are used, it is just as well to do only one training run, and in that case one should probably set the iteration limit to a large number so that training will be stopped by one of the other conditions, such as an error goal (egoal).
The next line shows: how many iterations mlp used (this counts the 0'th iteration, so it is one more than the final iteration number reported on the previous line); how many calls of the error function it made; the final error value; and the final size of the error gradient vector (square root of the sum of squares), normalized by dividing it by the final size of the weights. The next line shows the root-mean-square of the change in weights between their initial values and their final values. The next line shows the combined user and system time used by the training run.19 The final line merely reports the name of the file to which mlp wrote the final weights.

19 Setting the initial network weights, reading the patterns file, and other (minor) setup work are not timed.
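The correct-vs.-rejected tabulation of Section A.1.1.5 can be sketched as follows, again for the allsame case in which the pattern-weight accumulators reduce to counts. This is a minimal illustration, not the NIST C implementation; the function name cvr_table and its argument layout are invented, and the convention of reporting 0.0 for "percentage correct" when every pattern is rejected is an assumption made for this sketch.

```python
def cvr_table(patterns, thresholds):
    """patterns: list of (correct_class, activations).

    Returns one (thresh, right, unknown, wrong, pct_correct, pct_rejected)
    tuple per threshold, mirroring the columns of the table."""
    rows = []
    for t in thresholds:
        n_right = n_wrong = n_unknown = 0
        for correct, acts in patterns:
            hyp = max(range(len(acts)), key=acts.__getitem__)
            if acts[hyp] >= t:            # accepted at this threshold
                if hyp == correct:
                    n_right += 1
                else:
                    n_wrong += 1
            else:                         # rejected at this threshold
                n_unknown += 1
        accepted = n_right + n_wrong
        pct_correct = 100.0 * n_right / accepted if accepted else 0.0
        pct_rejected = 100.0 * n_unknown / len(patterns)
        rows.append((t, n_right, n_unknown, n_wrong, pct_correct, pct_rejected))
    return rows
```

As in the table of Section A.1.1.5, raising the threshold trades a higher "correct" percentage among accepted patterns against a higher "rejected" percentage overall.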