Commit

update_doc
yangheng95 committed May 1, 2022
1 parent 0e8bb59 commit 59d06b1
Showing 5 changed files with 22 additions and 17 deletions.
6 changes: 3 additions & 3 deletions README.MD
@@ -17,9 +17,9 @@
# | [Overview](./README.MD) | [HuggingfaceHub](readme/huggingface_readme.md) | [ABDADatasets](readme/dataset_readme.md) | [ABSA Models](readme/model_readme.md) | [Colab Tutorials](readme/tutorial_readme.md) |

## Try our demos on Huggingface Space
-- [Aspect-based sentiment classification (Multilingual)](https://huggingface.co/spaces/yangheng/PyABSA-APC)
-- [Aspect term extraction & sentiment classification (Multilingual)](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC)
-- [方面术语提取和情感分类(中文)](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC-Chinese)
+- [Aspect-based sentiment classification (Multilingual)](https://huggingface.co/spaces/yangheng/PyABSA-APC) (English, Chinese, etc.)
+- [Aspect term extraction & sentiment classification](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC) (English, Chinese, Arabic, Dutch, French, Russian, Spanish, Turkish, etc.)
+- [方面术语提取和情感分类](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC-Chinese) (中文, etc.)

## Package Overview

13 changes: 13 additions & 0 deletions pyabsa/core/atepc/dataset_utils/atepc_utils.py
@@ -13,6 +13,7 @@
get_cdw_vec,
get_lca_ids_and_cdm_vec)


# It is hard to tokenize multilingual text, I decide to use a pretrained tokenizer, you can alter according to your demands
# tokenizer = AutoTokenizer.from_pretrained('bert-base-multilingual-cased')

@@ -46,6 +47,18 @@ def split_text(text):
    return word_list


+def process_iob_tags(iob_tags: list) -> list:
+    for i in range(len(iob_tags) - 1):
+
+        if iob_tags[i] == 'O' and 'ASP' in iob_tags[i+1]:
+            iob_tags[i+1] = 'B-ASP'
+
+        # if 'ASP' in iob_tags[i] and 'B-ASP' in iob_tags[i + 1]:
+        #     iob_tags[i + 1] = 'I-ASP'
+
+    return iob_tags
+
+
def prepare_input_for_atepc(opt, tokenizer, text_left, text_right, aspect):
    if hasattr(opt, 'dynamic_truncate') and opt.dynamic_truncate:
        _max_seq_len = opt.max_seq_len - len(aspect.split())
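
To make the effect of the `process_iob_tags` helper added in this hunk concrete, here is a minimal standalone sketch. The function body is copied from the diff; the sample tag lists are hypothetical and are not taken from the repository. It suggests that any aspect tag directly following an `O` is rewritten to `B-ASP`, while a span starting at index 0 is left unchanged because no `O` precedes it.

```python
from typing import List


def process_iob_tags(iob_tags: List[str]) -> List[str]:
    """Rewrite any aspect tag that directly follows 'O' as 'B-ASP' (in place)."""
    for i in range(len(iob_tags) - 1):
        if iob_tags[i] == 'O' and 'ASP' in iob_tags[i + 1]:
            iob_tags[i + 1] = 'B-ASP'
    return iob_tags


# Hypothetical prediction: the first aspect span wrongly opens with I-ASP.
predicted = ['O', 'I-ASP', 'I-ASP', 'O', 'B-ASP']
repaired = process_iob_tags(predicted)
print(repaired)

# A span at position 0 has no preceding 'O', so this helper does not touch it.
leading = process_iob_tags(['I-ASP', 'O'])
print(leading)
```

Note that the list is modified in place and the same object is returned, so a caller that needs the raw predictions afterwards would have to copy them first.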
6 changes: 3 additions & 3 deletions pyabsa/core/atepc/prediction/aspect_extractor.py
@@ -22,8 +22,8 @@

from pyabsa.functional.dataset import detect_infer_dataset, DatasetItem
from pyabsa.core.atepc.models import ATEPCModelList
-from pyabsa.core.atepc.dataset_utils.atepc_utils import load_atepc_inference_datasets
-from pyabsa.utils.pyabsa_utils import print_args, save_json, TransformerConnectionError, iob_processing
+from pyabsa.core.atepc.dataset_utils.atepc_utils import load_atepc_inference_datasets, process_iob_tags
+from pyabsa.utils.pyabsa_utils import print_args, save_json, TransformerConnectionError
from ..dataset_utils.data_utils_for_inference import (ATEPCProcessor,
convert_ate_examples_to_features,
convert_apc_examples_to_features,
@@ -295,7 +295,7 @@ def _extract(self, examples, infer_batch_size=256):

POLARITY_PADDING = [SENTIMENT_PADDING] * len(polarity)
example_id = i_batch * self.opt.infer_batch_size + i
-pred_iobs = iob_processing(pred_iobs)
+pred_iobs = process_iob_tags(pred_iobs)
for idx in range(1, len(polarity)):

if polarity[idx - 1] != str(SENTIMENT_PADDING) and split_aspect(pred_iobs[idx - 1], pred_iobs[idx]):
8 changes: 0 additions & 8 deletions pyabsa/utils/pyabsa_utils.py
@@ -93,14 +93,6 @@ def check_and_fix_IOB_labels(label_map, opt):
opt.index_to_IOB_label = index_to_IOB_label


-def iob_processing(iobs: list):
-    _iobs = iobs[:]
-    for i in range(1, len(iobs)):
-        if iobs[i - 1] == 'O' and 'ASP' in iobs[i]:
-            _iobs[i] = 'B-ASP'
-
-    return _iobs
-

def get_device(auto_device):
    if isinstance(auto_device, str) and auto_device == 'allcuda':
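
The removed `iob_processing` and its replacement `process_iob_tags` look behaviorally close, and the sketch below is one way to check that on a sample. Both function bodies are copied from this commit; the sample list and the variable names around them are hypothetical. On this input the two return equal tag lists; the visible difference is that the old helper works on a copy while the new one mutates its argument.

```python
from typing import List


def iob_processing(iobs: List[str]) -> List[str]:
    # Removed helper: reads the original list, writes to a copy.
    _iobs = iobs[:]
    for i in range(1, len(iobs)):
        if iobs[i - 1] == 'O' and 'ASP' in iobs[i]:
            _iobs[i] = 'B-ASP'
    return _iobs


def process_iob_tags(iob_tags: List[str]) -> List[str]:
    # New helper: same rule, applied in place.
    for i in range(len(iob_tags) - 1):
        if iob_tags[i] == 'O' and 'ASP' in iob_tags[i + 1]:
            iob_tags[i + 1] = 'B-ASP'
    return iob_tags


sample = ['O', 'I-ASP', 'I-ASP', 'O', 'I-ASP']  # hypothetical predicted tags

old_input = list(sample)
old_result = iob_processing(old_input)

new_input = list(sample)
new_result = process_iob_tags(new_input)

same_output = old_result == new_result        # both repair the same positions
old_input_untouched = old_input == sample     # the copy-based helper leaves its input alone
new_input_mutated = new_input is new_result and new_input != sample
print(same_output, old_input_untouched, new_input_mutated)
```

Since the rule only ever rewrites an aspect tag to `B-ASP` and never creates or removes an `O`, mutating in place should not change which positions match. At the call site in `_extract` the result is assigned back to `pred_iobs`, so the in-place behavior appears harmless there.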
6 changes: 3 additions & 3 deletions readme/huggingface_readme.md
@@ -18,9 +18,9 @@


## Try our demos on Huggingface Space
-- [Aspect-based sentiment classification (Multilingual)](https://huggingface.co/spaces/yangheng/PyABSA-APC)
-- [Aspect term extraction & sentiment classification (Multilingual)](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC)
-- [方面术语提取和情感分类(中文)](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC-Chinese)
+- [Aspect-based sentiment classification (Multilingual)](https://huggingface.co/spaces/yangheng/PyABSA-APC) (English, Chinese, etc.)
+- [Aspect term extraction & sentiment classification](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC) (English, Chinese, Arabic, Dutch, French, Russian, Spanish, Turkish, etc.)
+- [方面术语提取和情感分类](https://huggingface.co/spaces/yangheng/PyABSA-ATEPC-Chinese) (中文, etc.)

## Try our demos on Huggingface Space via API
- [Aspect-based sentiment classification (Multilingual)](https://huggingface.co/spaces/yangheng/PyABSA-APC)