<-i <string>> | Comma-separated input .mln files. (With the -multipleDatabases option, the second through last files contain constants from different domains, and they correspond to the .db files specified with the -t option.) |
<-o <string>> | Output .mln file containing learned formulas and weights. |
<-t <string>> | Comma-separated .db files containing the training database (of true/false ground atoms), including function definitions, e.g., ai.db,graphics.db,languages.db. |
[-ne <string>] | [all predicates] Non-evidence predicates (comma-separated with no space), e.g., cancer,smokes,friends. |
[-multipleDatabases [bool]] | If specified, each .db file belongs to a separate domain; otherwise all .db files belong to the same domain. |
[-beamSize <integer>] | [5] Size of beam in beam search. |
[-minWt <double>] | [0.01] Candidate clauses are discarded if their absolute weights fall below this. |
[-penalty <double>] | [0.01] Each difference between the current and previous version of a candidate clause penalizes the (weighted) pseudo-log-likelihood by this amount. |
[-maxVars <integer>] | [6] Maximum number of variables in learned clauses. |
[-maxNumPredicates <integer>] | [6] Maximum number of predicates in learned clauses. |
[-cacheSize <integer>] | [500] Size in megabytes of the cache that is used to store the clauses (and their counts) that are created during structure learning. |
[-noSampleClauses [bool]] | If specified, compute a clause's number of true groundings exactly, and do not estimate it by sampling its groundings. If not specified, estimate the number by sampling. |
[-delta <double>] | [0.05] (Used only if sampling clauses.) The probability that an estimate of a clause's number of true groundings is off by more than the epsilon error is less than this value. Used to determine the number of samples of the clause's groundings to draw. |
[-epsilon <double>] | [0.2] (Used only if sampling clauses.) Fractional error from a clause's actual number of true groundings. Used to determine the number of samples of the clause's groundings to draw. |
[-minClauseSamples <integer>] | [-1] (Used only if sampling clauses.) Minimum number of samples of a clause's groundings to draw. (-1: no minimum) |
[-maxClauseSamples <integer>] | [-1] (Used only if sampling clauses.) Maximum number of samples of a clause's groundings to draw. (-1: no maximum) |
[-noSampleAtoms [bool]] | If specified, do not estimate the (weighted) pseudo-log-likelihood by sampling ground atoms; otherwise, estimate the value by sampling. |
[-fractAtoms <double>] | [0.8] (Used only if sampling ground atoms.) Fraction of each predicate's ground atoms to draw. |
[-minAtomSamples <integer>] | [-1] (Used only if sampling ground atoms.) Minimum number of each predicate's ground atoms to draw. (-1: no minimum) |
[-maxAtomSamples <integer>] | [-1] (Used only if sampling ground atoms.) Maximum number of each predicate's ground atoms to draw. (-1: no maximum) |
[-noPrior [bool]] | No Gaussian priors on formula weights. |
[-priorMean <double>] | [0] Means of Gaussian priors on formula weights. By default, for each formula, the mean is the weight given in the .mln input file, or a fraction thereof if the formula turns into multiple clauses. The mean specified here applies when no weight is given in the .mln file. |
[-priorStdDev <double>] | [100] Standard deviations of Gaussian priors on clause weights. |
[-tightMaxIter <integer>] | [10000] Max number of iterations to run L-BFGS-B, the algorithm used to optimize the (weighted) pseudo-log-likelihood. |
[-tightConvThresh <double>] | [1e-5] Fractional change in (weighted) pseudo-log-likelihood at which L-BFGS-B terminates. |
[-looseMaxIter <integer>] | [10] Max number of iterations to run L-BFGS-B when evaluating candidate clauses. |
[-looseConvThresh <double>] | [1e-3] Fractional change in (weighted) pseudo-log-likelihood at which L-BFGS-B terminates when evaluating candidate clauses. |
[-numClausesReEval <integer>] | [10] Keep this number of candidate clauses with the highest estimated scores, and re-evaluate their scores precisely. |
[-noWtPredsEqually [bool]] | If specified, predicates are not weighted equally, so high-arity predicates contribute more to the pseudo-log-likelihood than low-arity ones. If not specified, each predicate is given equal weight in the weighted pseudo-log-likelihood. |
[-startFromEmptyMLN [bool]] | If specified, start structure learning from an empty MLN. If the input .mln contains formulas, they will be added to the candidate clauses created in the first step of beam search. If not specified, begin structure learning from the input .mln file. |
|
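
The options in the table above can be assembled into a command line programmatically. The following is a minimal sketch, not part of the distribution: the executable name `learnstruct`, the helper `build_learnstruct_args`, and the file name `univ.mln` are assumptions made for illustration, while the flags themselves are taken from the table. Boolean flags are passed without a value, which the `[bool]` notation suggests is permitted.

```python
def build_learnstruct_args(mln_files, out_mln, db_files,
                           non_evidence=None, extra=None):
    """Assemble an argument list from the options in the table above.

    `extra` maps a flag to its value; use None for a boolean flag
    that is given without a value.
    """
    args = [
        "learnstruct",               # assumed executable name
        "-i", ",".join(mln_files),   # comma-separated, no spaces
        "-o", out_mln,
        "-t", ",".join(db_files),
    ]
    if non_evidence:
        args += ["-ne", ",".join(non_evidence)]
    for flag, value in (extra or {}).items():
        args.append(flag)
        if value is not None:
            args.append(str(value))
    return args


# Hypothetical file names; the .db names and predicates echo the table's examples.
args = build_learnstruct_args(
    ["univ.mln"],
    "univ-out.mln",
    ["ai.db", "graphics.db", "languages.db"],
    non_evidence=["cancer", "smokes", "friends"],
    extra={"-beamSize": 10, "-noSampleClauses": None},
)
print(" ".join(args))
```

The resulting list can be handed to `subprocess.run(args)` if the executable is on the path; building a list rather than a single string avoids shell quoting problems with file names.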
structlearn.h contains most of the structure learning code. structlearn.cpp contains the code that handles formulas with variables that are existentially quantified, or have mutually exclusive and exhaustive values.
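
To make the atom-sampling options in the table concrete, here is a hedged sketch of how -fractAtoms, -minAtomSamples, and -maxAtomSamples could combine into a per-predicate sample count. The function `num_atoms_to_sample` is hypothetical, and the order in which the bounds are applied is an assumption; the actual logic in structlearn.h may differ. Only the defaults (0.8, -1, -1) and the meaning of -1 as "no bound" come from the table.

```python
def num_atoms_to_sample(total, fract=0.8, min_samples=-1, max_samples=-1):
    """Return how many of a predicate's `total` ground atoms to draw.

    Assumed behaviour (not confirmed by the source): take the requested
    fraction, raise it to the minimum, cap it at the maximum, and never
    draw more atoms than exist. A bound of -1 is ignored.
    """
    n = int(round(fract * total))
    if min_samples != -1:
        n = max(n, min_samples)
    if max_samples != -1:
        n = min(n, max_samples)
    return min(n, total)


default_count = num_atoms_to_sample(1000)                   # fraction only
capped_count = num_atoms_to_sample(1000, max_samples=500)   # maximum applies
raised_count = num_atoms_to_sample(1000, min_samples=900)   # minimum applies
small_domain = num_atoms_to_sample(10, min_samples=50)      # limited by total
```

Under these assumptions the four calls give 800, 500, 900, and 10 atoms respectively.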