Why can't a simple PCFG reach very high accuracy?

The most convincing answer is still the one that comes out of an experiment:
( S 1.95955e-05
    ( NP 'I' )
    ( VP 2.61274e-05
        ( VP 0.001008
            ( V 'saw' )
            ( NP 0.00168
                ( NP 'John' )
                ( PP 0.0336
                    ( P 'with' )
                    ( NP 0.056
                        ( Det 'a' )
                        ( N 'dog' ) ) ) ) )
        ( PP 0.0864
            ( P 'with' )
            ( NP 0.216
                ( Det 'my' )
                ( N 'cookie' ) ) ) ) )
( S 1.30637e-05
    ( NP 'I' )
    ( VP 1.74182e-05
        ( V 'saw' )
        ( NP 2.90304e-05
            ( NP 0.00168
                ( NP 'John' )
                ( PP 0.0336
                    ( P 'with' )
                    ( NP 0.056
                        ( Det 'a' )
                        ( N 'dog' ) ) ) )
            ( PP 0.0864
                ( P 'with' )
                ( NP 0.216
                    ( Det 'my' )
                    ( N 'cookie' ) ) ) ) ) )
( S 1.95955e-05
    ( NP 'I' )
    ( VP 2.61274e-05
        ( VP 0.15
            ( V 'saw' )
            ( NP 'John' ) )
        ( PP 0.000580608
            ( P 'with' )
            ( NP 0.00096768
                ( NP 0.056
                    ( Det 'a' )
                    ( N 'dog' ) )
                ( PP 0.0864
                    ( P 'with' )
                    ( NP 0.216
                        ( Det 'my' )
                        ( N 'cookie' ) ) ) ) ) ) )
( S 1.30637e-05
    ( NP 'I' )
    ( VP 1.74182e-05
        ( V 'saw' )
        ( NP 2.90304e-05
            ( NP 'John' )
            ( PP 0.000580608
                ( P 'with' )
                ( NP 0.00096768
                    ( NP 0.056
                        ( Det 'a' )
                        ( N 'dog' ) )
                    ( PP 0.0864
                        ( P 'with' )
                        ( NP 0.216
                            ( Det 'my' )
                            ( N 'cookie' ) ) ) ) ) ) ) )

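The sentence above, "I saw John with a dog with my cookie", comes out with four parses but only two distinct scores: the first and third parses both get 1.95955e-05, and the second and fourth both get 1.30637e-05. A PCFG scores a tree by multiplying the conditional probabilities of the rules it uses, so any two parses that use the same rules the same number of times get the same product, no matter which NP or VP each PP ends up attached to. The NP/PP attachment ambiguity therefore cannot be resolved.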
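
To make the tie concrete, here is a minimal Python sketch. The rule probabilities are invented for illustration and are not the grammar that produced the output above; the helper names `rules` and `tree_prob` are hypothetical too. It only assumes that a tree's score is the product of its rule probabilities.

```python
from collections import Counter
from math import prod, isclose

# Hypothetical rule probabilities (NOT the grammar behind the output above).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.4, ("NP", ("NP", "PP")): 0.2,
    ("NP", ("I",)): 0.2, ("NP", ("John",)): 0.2,
    ("VP", ("V", "NP")): 0.7, ("VP", ("VP", "PP")): 0.3,
    ("PP", ("P", "NP")): 1.0,
    ("V", ("saw",)): 1.0, ("P", ("with",)): 1.0,
    ("Det", ("a",)): 0.5, ("Det", ("my",)): 0.5,
    ("N", ("dog",)): 0.5, ("N", ("cookie",)): 0.5,
}

def rules(tree):
    """Yield every (lhs, rhs) rule used in a tree of nested tuples."""
    label, *children = tree
    yield (label, tuple(c if isinstance(c, str) else c[0] for c in children))
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)

def tree_prob(tree):
    """PCFG score: the product of the probabilities of all rules used."""
    return prod(RULE_PROB[r] for r in rules(tree))

# Shared building blocks for the four parses.
I, JOHN = ("NP", "I"), ("NP", "John")
SAW, WITH = ("V", "saw"), ("P", "with")
DOG = ("NP", ("Det", "a"), ("N", "dog"))
COOKIE = ("NP", ("Det", "my"), ("N", "cookie"))

def pp(np):
    return ("PP", WITH, np)

# Same order as the four parses printed above.
tree1 = ("S", I, ("VP", ("VP", SAW, ("NP", JOHN, pp(DOG))), pp(COOKIE)))
tree2 = ("S", I, ("VP", SAW, ("NP", ("NP", JOHN, pp(DOG)), pp(COOKIE))))
tree3 = ("S", I, ("VP", ("VP", SAW, JOHN), pp(("NP", DOG, pp(COOKIE)))))
tree4 = ("S", I, ("VP", SAW, ("NP", JOHN, pp(("NP", DOG, pp(COOKIE))))))

for name, t in [("tree1", tree1), ("tree2", tree2),
                ("tree3", tree3), ("tree4", tree4)]:
    print(name, tree_prob(t))

# Different structures, identical rule multisets, so identical scores.
print(Counter(rules(tree1)) == Counter(rules(tree3)))
print(Counter(rules(tree2)) == Counter(rules(tree4)))
```

In this sketch the two groups differ by exactly one rule (VP -> VP PP in place of one NP -> NP PP), so the ratio between their scores is just the ratio of those two rule probabilities. Nothing in the score depends on which words sit under the PP, and that is the information an unlexicalized PCFG has no access to.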
_________________________________________________________

The next step is to lexicalize the parser. After all this time spent reading books, it is finally time for some hands-on work.