Stravanni/large-scale-data-integration

Large Scale Data Integration

  1. Install the Python dependencies: pip install -r requirements.txt
  2. Install Visual Studio Code
  3. Install the Data Wrangler extension for VS Code

PhD Course Assignment: Big Data Integration Project Paper

Objective: Students will have the flexibility to choose between two project options:

  1. Evaluation of Data Integration Techniques: Students will evaluate a set of data integration techniques on a dataset/domain of their choice.
  2. Evaluation and Comparison of Data Integration Techniques: Students will conduct an experimental evaluation and comparison of multiple data integration techniques on known benchmarks (i.e., using the datasets employed in the referenced papers).

Assignment Guidelines:

  • Paper Length: ~2 pages long.
  • Project Options:
    • Option 1: Individual Project. Students can choose to work on the project individually.
    • Option 2: Group Project (two members). Students can opt to work in pairs. Each member must specify their contribution to the project.
  • Project Proposal: Before starting the project, students must submit a brief proposal outlining their chosen project option, dataset/domain of interest, and the specific data integration techniques they plan to evaluate or compare.
  • Experimental Design: Define clear objectives for the experiment. Describe the dataset/domain chosen for the evaluation/comparison. Detail the data integration techniques selected for evaluation/comparison. Justify the selection of techniques based on relevance to the chosen dataset/domain and existing literature.
  • Implementation: Implement the chosen data integration techniques within a suitable environment. Document the implementation process, including any challenges faced and solutions devised.
  • Experimental Evaluation: Include relevant metrics for evaluation, such as precision/recall, scalability, and any domain-specific measures if needed. Provide detailed experimental results, including tables, charts, or visualizations where applicable.
  • Comparison (if applicable): If comparing multiple techniques, provide a comparison of their performance based on the selected metrics. Analyze and interpret the results to draw meaningful conclusions.
  • Discussion: Highlight the strengths and weaknesses of the evaluated/compared techniques. Propose potential areas for further research or improvement.
  • Contribution Statement (for group projects): Each group member must include a contribution statement outlining their individual contributions to the project.
  • References: Include a list of all references cited in the paper following the appropriate citation style (e.g., APA, IEEE).
  • Submission Guidelines: Send the paper as a PDF file by email. Template for the paper
  • Assessment Criteria: The papers will be assessed based on the following criteria:
    • Clarity and organization of the paper.
    • Thoroughness and relevance of the experimental evaluation/comparison.
    • Quality of analysis and interpretation of results.
    • Originality and creativity in the choice of dataset/domain and techniques.
    • Contribution (for group projects).
  • Important Dates:
    • Proposal Submission Deadline: 15/06/2024
    • Final Paper Submission Deadline: 30/06/2024

Note: Any deviations from the assignment guidelines must be discussed and approved by the course instructor in advance.
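
For the precision/recall metrics mentioned under Experimental Evaluation, the following is a minimal sketch of how they could be computed for a record-matching task. It assumes the technique's output and the ground truth are both available as pairs of record identifiers. The function names and the toy identifiers (`a1`, `b1`, ...) are hypothetical, chosen only for illustration, and are not part of the course material.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Pair = FrozenSet[str]


def normalize(pairs: Iterable[Tuple[str, str]]) -> Set[Pair]:
    """Treat (a, b) and (b, a) as the same match and drop self-pairs."""
    return {frozenset(p) for p in pairs if p[0] != p[1]}


def precision_recall_f1(
    predicted: Iterable[Tuple[str, str]],
    truth: Iterable[Tuple[str, str]],
) -> Tuple[float, float, float]:
    """Pair-level precision, recall, and F1 of predicted matches vs. ground truth."""
    pred = normalize(predicted)
    gold = normalize(truth)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data (hypothetical identifiers): 3 true matches, 4 predicted, 2 correct.
    truth = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
    predicted = [("b1", "a1"), ("a2", "b2"), ("a4", "b4"), ("a5", "b9")]
    p, r, f = precision_recall_f1(predicted, truth)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
    # prints: precision=0.50 recall=0.67 f1=0.57
```

Representing each match as an unordered pair avoids counting the same match twice when a technique reports it in the opposite order. For scalability, the same evaluation can be repeated on growing samples of the dataset while recording the running time.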
