Next: 3.4.1 Memory-efficient inference Up: 3 Quick Start Previous: 3.3 Structure Learning

3.4 Inference

To perform inference, run the infer executable, e.g.:

ALCHDIR/bin/infer -i univ-out.mln -e univ-test.db -r univ.results -q advisedBy,student,professor -c -p -mcmcMaxSteps 20000

-i specifies the input .mln file. In that file all formulas must be preceded by a weight or terminated by a period (but not both). An exception is a unit formula with variables followed by the ! operator. Such a unit formula can be preceded by a weight, or terminated by a period, or neither. (For such a unit formula, the code automatically creates formulas stating that the variables have mutually exclusive and exhaustive values. See Section 4.)

Each formula in the input .mln file is converted to CNF. If a weight precedes the formula, it is divided equally among its CNF clauses. If the formula is terminated by a period (i.e., the formula is hard), each of its CNF clauses is given a default weight that is twice the maximum soft clause weight. If neither weight nor period is specified for a unit formula with variables followed by !, each of its CNF clauses is given a default weight that is 1.5 times the maximum soft clause weight. (See the developer's manual on how to change the default weights.)
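The weight rules above can be illustrated with a small Python sketch. It is not Alchemy's code: the function name, the dictionary layout and the formula names are hypothetical, and it assumes that "maximum soft clause weight" means the largest per-clause weight obtained after soft formula weights have been divided among their CNF clauses.

```python
def per_clause_weights(formulas):
    """Return the weight given to each CNF clause of every formula.

    `formulas` maps a (hypothetical) formula name to a dict with:
      weight:      float, or None if no weight precedes the formula
      hard:        True if the formula is terminated by a period
      num_clauses: number of clauses in the formula's CNF
    A formula with weight None and hard False stands for a unit formula
    with '!' variables that has neither a weight nor a period.
    """
    weights = {}
    # Soft formulas: the weight is divided equally among the CNF clauses.
    for name, f in formulas.items():
        if f["weight"] is not None:
            weights[name] = f["weight"] / f["num_clauses"]
    if not weights:
        # Assumption of this sketch: the defaults need at least one soft clause.
        raise ValueError("no soft clauses to derive default weights from")
    max_soft = max(weights.values())
    # Hard formulas get 2 x max soft weight; unweighted '!' unit formulas 1.5 x.
    for name, f in formulas.items():
        if f["weight"] is None:
            factor = 2.0 if f["hard"] else 1.5
            weights[name] = factor * max_soft
    return weights


example = {
    "soft_two_clauses": {"weight": 3.0, "hard": False, "num_clauses": 2},
    "soft_one_clause": {"weight": 0.8, "hard": False, "num_clauses": 1},
    "hard_formula": {"weight": None, "hard": True, "num_clauses": 3},
    "excl_unit": {"weight": None, "hard": False, "num_clauses": 1},
}
print(per_clause_weights(example))
```

In this example the largest soft clause weight is 3.0 / 2 = 1.5, so each clause of the hard formula receives 3.0 and the clause of the unweighted unit formula receives 2.25.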

-e specifies the evidence .db file; a comma-separated list can be used to specify more than one .db file. -r specifies the output file, to which the inference results are written.

-q specifies the query predicates. You can specify more than one query predicate, and restrict the query to particular groundings, e.g., -q advisedBy(x,Ida),advisedBy(Ida,Geri). (Depending on the shell you are using, you may have to enclose the query predicates in quotes because of the presence of parentheses.) You can also use the -f option to specify a file (same format as a .db file without false and unknown atoms) containing the query ground atoms you are interested in. (You may use both -q and -f together.)

An evidence predicate is a predicate that has at least one grounding in the .db evidence file. All evidence predicates are closed-world by default, and all non-evidence predicates are open-world by default. The user may make some evidence predicates open-world by listing them with the -o option, and may make some non-evidence predicates closed-world by listing them with the -c option; the latter effectively turns them into evidence predicates whose groundings are all false.

If a ground atom is listed as a query atom on the command line or in the query file, or is specified as unknown in the evidence file, this overrides any closed-world defaults or options. If a first-order predicate is listed as a query predicate and the evidence file contains at least one of its groundings, the predicate is open-world; in other words, the openness of query predicates overrides the closedness of evidence predicates. If a predicate is closed-world and some of its atoms are query atoms, the predicate is treated as closed-world except for those query atoms.

If a predicate is simultaneously listed as a query predicate and as closed-world with the -c option, or appears in both the -c and -o lists, an error message is returned to the user. If the user specifies an evidence predicate as closed with the -c option, or a non-evidence predicate as open with -o, a warning message is returned, because these are already the defaults.

Type ALCHDIR/bin/infer without any parameters to see all available options.
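The open-world and closed-world rules can be summarized at the predicate level by the following Python sketch. It is only an illustration of the rules as stated here, not Alchemy's implementation: the function and argument names are hypothetical, and per-atom overrides (individual query atoms or atoms marked unknown in the evidence file) are deliberately left out.

```python
import warnings


def resolve_worlds(predicates, evidence_preds, query_preds,
                   closed_opt=(), open_opt=()):
    """Map each predicate name to 'open' or 'closed'.

    evidence_preds: predicates with at least one grounding in the .db file
    query_preds:    first-order predicates listed with -q
    closed_opt:     predicates listed with -c
    open_opt:       predicates listed with -o
    """
    evidence_preds, query_preds = set(evidence_preds), set(query_preds)
    closed_opt, open_opt = set(closed_opt), set(open_opt)

    # Contradictory requests are errors.
    if closed_opt & open_opt:
        raise ValueError("predicate listed with both -c and -o")
    if closed_opt & query_preds:
        raise ValueError("query predicate listed as closed-world with -c")

    result = {}
    for p in predicates:
        is_evidence = p in evidence_preds
        # Restating a default only triggers a warning.
        if p in closed_opt and is_evidence:
            warnings.warn(f"{p}: evidence predicates are closed-world by default")
        if p in open_opt and not is_evidence:
            warnings.warn(f"{p}: non-evidence predicates are open-world by default")

        if p in query_preds or p in open_opt:
            result[p] = "open"      # openness of query predicates wins
        elif p in closed_opt:
            result[p] = "closed"
        else:
            result[p] = "closed" if is_evidence else "open"
    return result


worlds = resolve_worlds(
    predicates=["advisedBy", "student", "professor", "taughtBy"],
    evidence_preds=["student", "professor"],
    query_preds=["advisedBy", "student"],
    closed_opt=["taughtBy"],
)
print(worlds)
```

Here student is an evidence predicate but is also queried, so it ends up open-world, while taughtBy (a hypothetical non-evidence predicate) is closed only because it was listed with -c.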

Alchemy supports two basic types of inference: MCMC and MAP/MPE. The current implementation contains three MCMC algorithms: Gibbs sampling (option -p), MC-SAT [6] (option -ms) and simulated tempering [5] (option -simtp). When MCMC inference is run, the probabilities that the query atoms are true are written to the output file specified. -mcmcMaxSteps is used to specify the maximum number of steps in the MCMC algorithm.
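To make the MCMC output concrete, the following Python sketch runs a plain Gibbs sampler over a tiny set of weighted ground clauses and reports, for each non-evidence atom, the fraction of samples in which it was true. This is a textbook-style illustration under simplifying assumptions (no burn-in, a single chain, toy-sized weights), not Alchemy's implementation; the function names, the clause representation and the atoms Smokes(A) and Cancer(A) are all hypothetical.

```python
import math
import random


def satisfied(clause, state):
    # A clause is a list of (atom, sign) literals; it holds if any literal holds.
    return any(state[atom] == sign for atom, sign in clause)


def score(clauses, state):
    # Sum of the weights of the satisfied ground clauses.
    return sum(w for w, clause in clauses if satisfied(clause, state))


def gibbs_marginals(atoms, clauses, evidence, max_steps, seed=0):
    rng = random.Random(seed)
    state = dict(evidence)
    free = [a for a in atoms if a not in evidence]
    for a in free:
        state[a] = rng.random() < 0.5          # random initial assignment
    counts = {a: 0 for a in free}
    for _ in range(max_steps):
        for a in free:
            # Conditional probability of a being true given all other atoms.
            state[a] = True
            s_true = score(clauses, state)
            state[a] = False
            s_false = score(clauses, state)
            p_true = 1.0 / (1.0 + math.exp(s_false - s_true))
            state[a] = rng.random() < p_true
        for a in free:
            counts[a] += state[a]
    # Estimated probability that each query atom is true.
    return {a: counts[a] / max_steps for a in free}


# One soft ground clause with weight 1.5: !Smokes(A) v Cancer(A)
clauses = [(1.5, [("Smokes(A)", False), ("Cancer(A)", True)])]
probs = gibbs_marginals(
    atoms=["Smokes(A)", "Cancer(A)"],
    clauses=clauses,
    evidence={"Smokes(A)": True},
    max_steps=20000,
)
print(probs)
```

With Smokes(A) fixed as evidence, the exact probability of Cancer(A) in this toy model is 1 / (1 + exp(-1.5)), so the estimate should approach that value as the number of steps grows.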

To use MAP inference instead, specify either the -m or -a option. The former returns only the true ground atoms, while the latter returns both true and false ones. For MAP inference, the output file also contains the weight assigned to a hard ground clause, the fraction of hard ground clauses that are satisfied, the sum of their weights, and the sum of the weights of satisfied soft ground clauses. During MAP inference, each hard clause (derived from a hard formula with a terminating period) is given a weight that is the sum of the soft clause weights plus 10.
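As a worked illustration of these quantities, the Python sketch below computes the hard clause weight and the summary figures for a given truth assignment. The function name, clause representation and returned keys are hypothetical, and it assumes that "the sum of their weights" refers to the satisfied hard ground clauses; it does not reproduce Alchemy's actual output format.

```python
def satisfied(clause, state):
    # A clause is a list of (atom, sign) literals; it holds if any literal holds.
    return any(state[atom] == sign for atom, sign in clause)


def map_summary(soft_clauses, hard_clauses, state):
    """soft_clauses: list of (weight, clause); hard_clauses: list of clauses."""
    # Each hard clause gets the sum of the soft clause weights plus 10.
    hard_weight = sum(w for w, _ in soft_clauses) + 10.0
    n_hard_sat = sum(1 for c in hard_clauses if satisfied(c, state))
    return {
        "hard_weight": hard_weight,
        "hard_fraction_satisfied":
            n_hard_sat / len(hard_clauses) if hard_clauses else 1.0,
        "hard_weight_satisfied": n_hard_sat * hard_weight,
        "soft_weight_satisfied":
            sum(w for w, c in soft_clauses if satisfied(c, state)),
    }


soft = [
    (1.5, [("Smokes(A)", False), ("Cancer(A)", True)]),  # !Smokes(A) v Cancer(A)
    (0.5, [("Cancer(A)", False)]),                       # !Cancer(A)
]
hard = [
    [("Smokes(A)", True)],
    [("Smokes(B)", True)],
]
state = {"Smokes(A)": True, "Smokes(B)": False, "Cancer(A)": True}
summary = map_summary(soft, hard, state)
print(summary)
```

In this example the soft weights sum to 2.0, so every hard clause receives weight 12.0; one of the two hard clauses is satisfied, and only the first soft clause contributes to the satisfied soft weight.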

The MAP inference engine used in Alchemy attempts to satisfy clauses with positive weights (just as in the original MaxWalkSat algorithm) and keep clauses with negative weights unsatisfied. As an extension to the MaxWalkSat algorithm, when a clause with a negative weight is chosen to fix, one true atom in that clause is chosen at random to be set to false.
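The handling of negative weights can be sketched in Python as follows. This is a simplified MaxWalkSat-style local search written for illustration, not Alchemy's code. The names and the p_noise parameter are hypothetical, and "one true atom" is interpreted here as one literal that currently evaluates to true, whose atom is then flipped.

```python
import random


def literal_true(literal, state):
    atom, sign = literal
    return state[atom] == sign


def violated(weight, clause, state):
    sat = any(literal_true(l, state) for l in clause)
    # Positive clauses should be satisfied, negative ones unsatisfied.
    return (weight > 0 and not sat) or (weight < 0 and sat)


def cost(clauses, state):
    return sum(abs(w) for w, c in clauses if violated(w, c, state))


def cost_if_flipped(clauses, state, atom):
    state[atom] = not state[atom]
    c = cost(clauses, state)
    state[atom] = not state[atom]
    return c


def fix_step(clauses, state, rng, p_noise=0.5):
    """Pick one violated clause and flip one atom to try to fix it."""
    bad = [(w, c) for w, c in clauses if violated(w, c, state)]
    if not bad:
        return False
    weight, clause = rng.choice(bad)
    if weight < 0:
        # Extension: make one currently true literal false, chosen at random.
        atom, _ = rng.choice([l for l in clause if literal_true(l, state)])
    elif rng.random() < p_noise:
        atom, _ = rng.choice(clause)                      # random-walk move
    else:
        atom = min((a for a, _ in clause),                # greedy move
                   key=lambda a: cost_if_flipped(clauses, state, a))
    state[atom] = not state[atom]
    return True


def max_walk_sat(clauses, state, max_flips=1000, seed=0):
    rng = random.Random(seed)
    best, best_cost = dict(state), cost(clauses, state)
    for _ in range(max_flips):
        if not fix_step(clauses, state, rng):
            break
        c = cost(clauses, state)
        if c < best_cost:
            best, best_cost = dict(state), c
    return best, best_cost


clauses = [(1.0, [("A", True)]), (-2.0, [("B", True), ("C", True)])]
start = {"A": False, "B": True, "C": True}
best, best_cost = max_walk_sat(clauses, dict(start))
print(best, best_cost)
```

A negative-weight clause with several true literals may need more than one such step before it becomes unsatisfied, since each step makes only one of them false.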



Marc Sumner 2007-01-16