custom Keyword inclusion #36

Open
Vignesh9395 opened this issue Jan 15, 2021 · 0 comments

Problem description

I need the generated summary to contain specific keywords from the input text.

Steps/code/corpus to reproduce

I need the pipeline component to accept keywords as an input parameter:

summary = lxr.get_summary(sentences, summary_size=2, threshold=.1, custom_keywords=keywords)

For example,

from lexrank import LexRank
from lexrank.mappings.stopwords import STOPWORDS
from path import Path

documents = []
documents_dir = Path('bbc/politics')

for file_path in documents_dir.files('*.txt'):
    with file_path.open(mode='rt', encoding='utf-8') as fp:
        documents.append(fp.readlines())

lxr = LexRank(documents, stopwords=STOPWORDS['en'])

# example text
sentences = [
    'One of David Cameron\'s closest friends and Conservative allies, '
    'George Osborne rose rapidly after becoming MP for Tatton in 2001.',

    'Michael Howard promoted him from shadow chief secretary to the '
    'Treasury to shadow chancellor in May 2005, at the age of 34.',

    'Mr Osborne took a key role in the election campaign and has been at '
    'the forefront of the debate on how to deal with the recession and '
    'the UK\'s spending deficit.',

    'Even before Mr Cameron became leader the two were being likened to '
    'Labour\'s Blair/Brown duo. The two have emulated them by becoming '
    'prime minister and chancellor, but will want to avoid the spats.',

    'Before entering Parliament, he was a special adviser in the '
    'agriculture department when the Tories were in government and later '
    'served as political secretary to William Hague.',

    'The BBC understands that as chancellor, Mr Osborne, along with the '
    'Treasury will retain responsibility for overseeing banks and '
    'financial regulation.',

    'Mr Osborne said the coalition government was planning to change the '
    'tax system \"to make it fairer for people on low and middle '
    'incomes\", and undertake \"long-term structural reform\" of the '
    'banking sector, education and the welfare state.',
]

# keywords
keywords = ['Michael Howard', 'chief secretary', 'BBC', 'Mr Osborne', 'Treasury']

# get summary with classical LexRank algorithm, passing the proposed
# custom_keywords parameter (this is the requested feature)
summary = lxr.get_summary(sentences, summary_size=2, threshold=.1, custom_keywords=keywords)
print(summary)

Expected output

['Michael Howard promoted him from shadow chief secretary to the '
 'Treasury to shadow chancellor in May 2005, at the age of 34.',
 'The BBC understands that as chancellor, Mr Osborne, along with the '
 'Treasury will retain responsibility for overseeing banks and '
 'financial regulation.']

As in the example above, I need a parameter that takes custom keywords, and those keywords must appear in the summarized text,
i.e. the sentences containing the keywords should be the top-ranked sentences.

Is there a way to do this, or does the library already provide a function for it?
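
One possible workaround, as a rough sketch only: rank the sentences as usual, then reorder them so that any sentence containing a keyword comes first. The helper name `keyword_first_summary` is hypothetical and not part of the library. It assumes per-sentence scores are already available, for instance from `lxr.rank_sentences(sentences, threshold=.1)`, and it matches keywords by case-insensitive substring, which is a simplifying assumption.

```python
from typing import List, Sequence


def keyword_first_summary(
    sentences: Sequence[str],
    scores: Sequence[float],
    keywords: Sequence[str],
    summary_size: int = 2,
) -> List[str]:
    """Hypothetical helper (not part of lexrank).

    Returns `summary_size` sentences, putting sentences that contain at
    least one keyword first and ordering each group by descending score.
    """
    lowered = [keyword.lower() for keyword in keywords]

    def has_keyword(sentence: str) -> bool:
        # Case-insensitive substring match; an assumption made for simplicity.
        text = sentence.lower()
        return any(keyword in text for keyword in lowered)

    # Sort key: keyword-bearing sentences first (False sorts before True),
    # then higher scores first within each group.
    order = sorted(
        range(len(sentences)),
        key=lambda i: (not has_keyword(sentences[i]), -scores[i]),
    )
    return [sentences[i] for i in order[:summary_size]]
```

With the example above, this could be called as `keyword_first_summary(sentences, lxr.rank_sentences(sentences, threshold=.1), keywords, summary_size=2)`. Note that it only guarantees keyword-bearing sentences are preferred, not that every keyword appears in the summary.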
