conllreader

Build syntactic trees from CoNLL parser output

exception conllreader.ConllInvalidPositionError(bad_root, max_root)

Bases: builtins.Exception

Raised when trying to build a subtree from a node that does not exist.

Variables:
  • bad_root – integer, the position from which we attempted to build a subtree
  • max_root – integer, the last valid position
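
As an illustration of how the two variables might be used, here is a minimal sketch of such an exception. It is a re-creation under assumptions, not the module's actual code: the message text and the `check_root` helper are hypothetical, and positions are assumed to be 1-based like CoNLL word ids.

```python
class ConllInvalidPositionError(Exception):
    """Sketch: raised when a subtree root position does not exist."""

    def __init__(self, bad_root, max_root):
        # The message wording is an assumption made for this example.
        super().__init__(
            "cannot build a subtree from position {} "
            "(last valid position: {})".format(bad_root, max_root)
        )
        self.bad_root = bad_root  # position we attempted to use
        self.max_root = max_root  # last valid position


def check_root(position, max_root):
    """Hypothetical helper: validate a 1-based position before use."""
    if not 1 <= position <= max_root:
        raise ConllInvalidPositionError(position, max_root)
    return position
```

A caller can then catch the exception and read `bad_root` and `max_root` to report which position was out of range.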

class conllreader.ConllSemanticAppender(syntactic_conll_file)

Bases: builtins.object

Appends semantic information to the “right” of a CoNLL file, that is, as additional columns.

The input is a syntactic CoNLL file, and the output is a so-called semantic CoNLL file.

add_frame_annotation(frame_annotation)
add_new_column(sentence_id)
dump_semantic_file(filename)
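
The methods above are undocumented, so the following is only a sketch of the general idea of appending a column to the right of a CoNLL sentence. The function name `append_column`, the `labels` mapping, and the `_` placeholder are all assumptions for illustration and do not reflect the class's real interface.

```python
def append_column(conll_sentence, labels, placeholder="_"):
    """Append one tab-separated column to every token line.

    labels maps 1-based token ids to the value of the new column;
    tokens without a label receive the placeholder.
    """
    out = []
    for line in conll_sentence.splitlines():
        if not line.strip():
            # Keep blank separator lines untouched.
            out.append(line)
            continue
        token_id = int(line.split("\t")[0])
        out.append(line + "\t" + labels.get(token_id, placeholder))
    return "\n".join(out)


sentence = "1\tJohn\t_\n2\tsleeps\t_"
print(append_column(sentence, {2: "Sleep"}))
```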

conllreader.LongestCommonSubstring(S1, S2)
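
The function has no description here. Going by its name, it presumably returns the longest contiguous run shared by `S1` and `S2`. A minimal dynamic-programming sketch of that technique follows; the name `longest_common_substring` and the choice to return the first longest match are assumptions, not the module's documented behaviour.

```python
def longest_common_substring(s1, s2):
    """Return the longest contiguous substring shared by s1 and s2."""
    # lengths[i][j] is the length of the longest common suffix
    # of s1[:i] and s2[:j].
    lengths = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    best_len, best_end = 0, 0
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            if s1[i - 1] == s2[j - 1]:
                lengths[i][j] = lengths[i - 1][j - 1] + 1
                if lengths[i][j] > best_len:
                    best_len, best_end = lengths[i][j], i
    return s1[best_end - best_len:best_end]


print(longest_common_substring("abcdef", "zcdez"))  # prints "cde"
```

The table costs O(len(s1) * len(s2)) time and memory, which is acceptable for sentence-length strings.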

class conllreader.SyntacticTreeBuilder(conll_tree)

Bases: builtins.object

Wrapper class for building a syntactic tree

Variables:
  • node_dict – every SyntacticTreeNode, indexed by CoNLL word id
  • father_ids – every dependency relation: child id -> father id
  • tree_list – every root node, that is every root subtree
  • sentence – the “sentence” (words separated by spaces)

fill_begin_end(node)

Fill begin/end values of every subtree
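
To make the variables listed for this class concrete, here is a rough sketch of how they could be filled from one CoNLL sentence and how begin/end values could then be propagated. It is not the module's implementation. The `Node` class and `build_trees` function are hypothetical, the column order is assumed to be CoNLL-X (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL), and `end` is assumed to be the inclusive index of the last character.

```python
class Node:
    """Hypothetical stand-in for SyntacticTreeNode."""

    def __init__(self, word_id, word, pos, deprel, begin_word):
        self.word_id, self.word = word_id, word
        self.pos, self.deprel = pos, deprel
        self.begin_word = begin_word
        self.father, self.children = None, []
        self.begin = self.end = None


def build_trees(conll_tree):
    """Return (node_dict, father_ids, tree_list, sentence)."""
    node_dict, father_ids, tree_list, words = {}, {}, [], []
    offset = 0
    for line in conll_tree.splitlines():
        if not line.strip():
            continue
        cols = line.split("\t")  # assumed CoNLL-X column order
        word_id, word, pos = int(cols[0]), cols[1], cols[3]
        head, deprel = int(cols[6]), cols[7]
        node_dict[word_id] = Node(word_id, word, pos, deprel, offset)
        father_ids[word_id] = head
        words.append(word)
        offset += len(word) + 1  # words are separated by one space
    for word_id, head in father_ids.items():
        if head == 0:  # head 0 marks a root in CoNLL-X
            tree_list.append(node_dict[word_id])
        else:
            node_dict[word_id].father = node_dict[head]
            node_dict[head].children.append(node_dict[word_id])
    for root in tree_list:
        fill_begin_end(root)
    return node_dict, father_ids, tree_list, " ".join(words)


def fill_begin_end(node):
    """Set begin/end of every subtree to the character span it covers."""
    node.begin = node.begin_word
    node.end = node.begin_word + len(node.word) - 1  # inclusive end
    for child in node.children:
        fill_begin_end(child)
        node.begin = min(node.begin, child.begin)
        node.end = max(node.end, child.end)


conll = (
    "1\tThe\tthe\tDT\tDT\t_\t2\tNMOD\n"
    "2\tcat\tcat\tNN\tNN\t_\t3\tSBJ\n"
    "3\tsleeps\tsleep\tVBZ\tVBZ\t_\t0\tROOT\n"
)
node_dict, father_ids, tree_list, sentence = build_trees(conll)
print(sentence, tree_list[0].begin, tree_list[0].end)
```

With this input, the root subtree spans the whole sentence, while the subtree rooted at "cat" covers only "The cat".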

class conllreader.SyntacticTreeNode(word_id, word, pos, deprel, begin_word)

Bases: builtins.object

A node (internal or terminal) of a syntactic tree

Variables:
  • word_id – int, the CoNLL word id (starts at 1)
  • word – string, the word contained by the node
  • pos – part-of-speech of the node
  • deprel – string, function attributed by the parser to this word
  • father – SyntacticTreeNode, the father of this node
  • children – SyntacticTreeNode list, the children of this node
  • begin – int, the character position this phrase starts at (root would be 0)
  • end – int, the position this phrase ends at (root would be last character)
  • begin_word – int, the position this word begins at

_closest_match_as_node_lcs(arg)
closest_match(arg)

Search for the closest match to arg

closest_match_as_node(arg)
contains(arg)

Search for an exact match of arg in all subtrees

flat()

Return the tokenized sentence from the parse tree.
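
The sketch below shows one plausible reading of `flat()`, `contains()` and the closest-match methods on a reduced, hypothetical `Node` class. The matching criterion is an assumption suggested only by the `_lcs` suffix: the subtree whose text shares the longest common substring with `arg` wins, with ties broken in favour of the shorter subtree. The real methods may behave differently.

```python
from difflib import SequenceMatcher


class Node:
    """Hypothetical stand-in for SyntacticTreeNode."""

    def __init__(self, word_id, word, children=()):
        self.word_id = word_id
        self.word = word
        self.children = list(children)

    def subtrees(self):
        """Yield this node, then every node below it."""
        yield self
        for child in self.children:
            yield from child.subtrees()

    def flat(self):
        """Words of this subtree in sentence order, joined by spaces."""
        nodes = sorted(self.subtrees(), key=lambda n: n.word_id)
        return " ".join(n.word for n in nodes)

    def contains(self, arg):
        """True if some subtree spells out exactly arg."""
        return any(n.flat() == arg for n in self.subtrees())

    def closest_match_as_node(self, arg):
        """Subtree sharing the longest common substring with arg."""
        def score(node):
            text = node.flat()
            matcher = SequenceMatcher(None, text, arg, autojunk=False)
            match = matcher.find_longest_match(0, len(text), 0, len(arg))
            return (match.size, -len(text))  # ties: shorter subtree wins
        return max(self.subtrees(), key=score)

    def closest_match(self, arg):
        """Text of the closest matching subtree."""
        return self.closest_match_as_node(arg).flat()


tree = Node(3, "sleeps", [Node(2, "cat", [Node(1, "The")]), Node(4, "soundly")])
print(tree.flat())
print(tree.contains("The cat"), tree.contains("cat sleeps"))
print(tree.closest_match("the cat"))
```

Here "cat sleeps" is not found by `contains` because no single subtree covers exactly those two words, which is why an approximate match can be useful when an argument does not align with the parse.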

class conllreader.TreeBuilderTest(methodName='runTest')

Bases: unittest.case.TestCase

setUp()
test_another_flat()
test_lima_tree()
test_none_fathers()
test_tree_contains()
test_tree_flat()
test_tree_match()
test_tree_str()