error case for cut #3

Open
Argons opened this issue Jan 29, 2015 · 3 comments

Comments

@Argons

Argons commented Jan 29, 2015

When calling cut on words of the form "东方XXX", e.g. 东方电视台, no segmentation result is returned, but no error is raised either.

@jannson
Owner

jannson commented Feb 1, 2015

OK, I'll trace it :)

@zyson

zyson commented Mar 7, 2017

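I think the cause is in the surname-prefix matcher SurnameCutting2 in yaha/__init__.py: the conditions inside its while loop do not handle the case where klen2 is greater than 2, so i never advances and the loop never terminates. You could try changing it like this: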
```python
# CuttingBase, get_dict and DICTS come from yaha/__init__.py (Python 2 code).
class SurnameCutting2(CuttingBase):
    def __init__(self):
        super(CuttingBase, self).__init__()
        self.stage = 4
        self.dict = get_dict(DICTS.SURNAME)
        self.stop_dict = get_dict(DICTS.EXT_STOPWORD)

    def cut_stage4(self, cuttor, sentence, graph, contex):
        start = contex['index']
        path = contex['path']
        new_path = contex['new_path']
        n = len(path)
        i = start
        while i < n:
            # Length of the current token, sentence[path[i-1]:path[i]].
            klen = path[i] - path[i-1]
            # Debug output, remove once the fix is confirmed.
            print 'surname', klen, sentence[path[i-1]:path[i]]
            if klen >= 1 and klen <= 2 and self.dict.has_key(sentence[path[i-1]:path[i]]):
                if i < n-2:
                    klen2 = path[i+1] - path[i]
                    klen3 = path[i+2] - path[i+1]
                    if klen2 == 1 and klen3 == 1 and not self.stop_dict.has_key(sentence[path[i+1]:path[i+2]]):
                        # Surname followed by two single characters.
                        new_path.append(path[i+2])
                        i += 3
                        continue
                    elif klen2 <= 2 and klen2 >= 1:
                        new_path.append(path[i+1])
                        i += 2
                        continue
                    else:
                        # klen2 > 2: keep the surname token alone and move on.
                        new_path.append(path[i])
                        i += 1
                        continue
                elif i < n-1:
                    klen2 = path[i+1] - path[i]
                    if klen2 <= 2 and klen2 >= 1:
                        new_path.append(path[i+1])
                        # The original line below looks like a bug:
                        # i = path[i+1]
                        i += 2
                        continue
                    else:
                        # klen2 > 2: keep the surname token alone and move on.
                        new_path.append(path[i])
                        i += 1
                        continue
                else:
                    return i
            else:
                # Not a surname: leave it to the next stage.
                return i
        return i
```
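
To check the reasoning without installing yaha, the loop can be pulled out into a standalone function. This is only a rough sketch: `merge_surnames`, its `patched` flag and the `max_steps` guard are hypothetical names invented for illustration, and the `patched=False` path is an assumption about the pre-fix behaviour (no `else` branch when klen2 > 2), inferred from the description above rather than copied from the repository. The segmentation `[0, 2, 5]` for 东方电视台 is likewise assumed.

```python
# -*- coding: utf-8 -*-
def merge_surnames(sentence, path, start, surnames, stopwords,
                   patched=True, max_steps=1000):
    """Standalone sketch of the cut_stage4 loop (hypothetical, not yaha API).

    path holds token end offsets, so token i is sentence[path[i-1]:path[i]].
    Returns (next_index, new_path). Raises RuntimeError when the loop stops
    making progress, which stands in for the reported hang.
    """
    new_path = []
    n = len(path)
    i = start
    steps = 0
    while i < n:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("no progress at index %d" % i)
        klen = path[i] - path[i - 1]
        if not (1 <= klen <= 2 and sentence[path[i - 1]:path[i]] in surnames):
            return i, new_path  # not a surname: hand over to the next stage
        if i >= n - 1:
            return i, new_path  # surname is the last token
        klen2 = path[i + 1] - path[i]
        if i < n - 2:
            klen3 = path[i + 2] - path[i + 1]
            if (klen2 == 1 and klen3 == 1
                    and sentence[path[i + 1]:path[i + 2]] not in stopwords):
                new_path.append(path[i + 2])  # surname + two single chars
                i += 3
                continue
        if 1 <= klen2 <= 2:
            new_path.append(path[i + 1])  # surname + one short token
            i += 2
        elif patched:
            new_path.append(path[i])  # long next token: keep surname alone
            i += 1
        # Assumed pre-fix behaviour: i is left unchanged, so no progress.
    return i, new_path


surnames = {u"东方", u"王"}
# Assumed segmentation: 东方 | 电视台, so klen2 == 3.
print(merge_surnames(u"东方电视台", [0, 2, 5], 1, surnames, set()))
```

With `patched=True` the call returns `(2, [2])`: the surname token 东方 is kept on its own and the index moves past it. With `patched=False` the same input trips the `max_steps` guard, which matches the symptom in the report (no result, no error).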

@jannson
Owner

jannson commented Mar 7, 2017

OK, thanks. I'll try making that change.
