v1.0
kevindeeboman committed Jan 6, 2021
1 parent 6602520 commit 6b85574
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions masters_project.ipynb
@@ -46,7 +46,7 @@
"source": [
"### Step 1: Converting JSON files to CSV and extracting relevant data (+data cleaning)\n",
"#### All job ads data used in this project is from [JobTechDev](https://jobtechdev.se/en/docs/apis/historical/), an initiative by the Swedish public employment service.\n",
-"- The complete dataset is about 30.8 GB and files are in the JSON-format\n",
+"- The complete dataset is about 30.8 GB (unzipped) and files are in the JSON-format\n",
"- Dataset contains 32 different columns/vars, many of which are beyond the scope of the project, so only a selection is extracted.\n",
"\n",
"- Variables to be extracted for this project are the following:\n",
@@ -101,7 +101,8 @@
" # Removal of special characters using regex, events of \\n causes errors in csv file #\n",
" head_line = re.sub('[!,*)@#%(&$_?.^\\\\\\\\\\n/]', '', str(ad['headline']))\n",
" if file != 2017:\n",
-" try: # Slicing of date variables is to ensure only dates of the format 'yyyy-mm-dd') are included, no time data needed #\n",
+" try: \n",
+" # Slicing of date variables is to ensure only dates of the format 'yyyy-mm-dd' are included, no time data needed #\n",
" ad_select = [head_line, int(ad['number_of_vacancies']), ad['publication_date'][:10], ad['application_deadline'][:10]]\n",
" except:\n",
" error_rows += 1\n",
@@ -4626,7 +4627,7 @@
"- In the graph above we observe the daily labor demand index plotted together with Statistics Sweden's quarterly survey. Visually, the two time series show a clear co-movement. Thus, it is expected that much of the variation in the official survey could be explained by the daily index.\n",
"- It seems that changes in labor demand in the daily index based on job ads happen about one quarter before the change is seen in the quarterly survey index. \n",
"- Volatility of the daily index increases substantially from 2016-2020. It should be investigated further if this change reflects a real change in labor demand or if the change is due to changes in data management at Arbetsförmedlingen. As seen in the output in step 1, erroneous ads start to decrease from 2016 onwards, which could affect the number of vacancies.\n",
-"- Given that the daily labor demand index is behaving as a higher-frequency version of Statistics Sweden's quarterly survey, it could potentially be used as a cheaper and more timely complement."
+"- Given that the daily labor demand index is behaving as a higher-frequency version of Statistics Sweden's quarterly survey, it could potentially be used as a cheap and timely complement."
]
},
{
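The observation in the last hunk, that the ad-based index appears to move about one quarter ahead of the survey, could be checked with a lagged correlation. The following is a minimal sketch, not the notebook's own code: `pearson` and `lead_correlations` are hypothetical helpers, both series are assumed to be already aggregated to quarterly frequency, and the numbers are synthetic placeholders rather than results from the project.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lead_correlations(index_q, survey_q, max_lead=4):
    """Correlate the index at quarter t-k with the survey at quarter t, for k = 0..max_lead.

    Hypothetical helper: assumes both lists cover the same quarters in the same order
    and that max_lead leaves at least a few overlapping observations.
    """
    result = {}
    for k in range(max_lead + 1):
        x = index_q[: len(index_q) - k]  # index values, dropped from the end
        y = survey_q[k:]                 # survey values, dropped from the start
        result[k] = pearson(x, y)
    return result


# Synthetic placeholder data (NOT project results):
# the "survey" is simply the index delayed by one quarter.
index_q = [1, 3, 2, 5, 4, 6, 3, 7, 5, 8]
survey_q = [2] + index_q[:-1]

corrs = lead_correlations(index_q, survey_q, max_lead=3)
best_lead = max(corrs, key=corrs.get)
print(best_lead, corrs)
```

Because the synthetic survey is built as a one-quarter delay of the index, a lead of one quarter gives the highest correlation here. On the real series, a peak at `k = 1` would be consistent with the visual impression described above, while a flat profile across leads would not support it.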
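The extraction step in the second hunk (stripping special characters from the headline, truncating timestamps to `yyyy-mm-dd`, and counting ads that fail) can be isolated into a small function. This is a sketch under assumptions rather than the notebook's code: `select_fields` is a hypothetical name, the `file != 2017` branch is left out, the sample ads are made up, and the bare `except:` is narrowed to the exceptions a missing or malformed field would plausibly raise.

```python
import re

# Same character class as in the diff, written as a raw string:
# punctuation, backslash, newline and forward slash are removed.
SPECIAL_CHARS = re.compile(r"[!,*)@#%(&$_?.^\\\n/]")


def select_fields(ad):
    """Return [headline, vacancies, publication date, deadline], or None if the ad is malformed."""
    try:
        head_line = SPECIAL_CHARS.sub("", str(ad["headline"]))
        return [
            head_line,
            int(ad["number_of_vacancies"]),
            ad["publication_date"][:10],      # keep 'yyyy-mm-dd', drop the time part
            ad["application_deadline"][:10],
        ]
    except (KeyError, TypeError, ValueError):
        # Missing key, None where a string or number is expected, or a non-numeric count.
        return None


# Made-up sample ads for illustration only.
ads = [
    {
        "headline": "Nurse (ICU)!\nStockholm",
        "number_of_vacancies": "2",
        "publication_date": "2019-03-01T00:00:00",
        "application_deadline": "2019-04-01T00:00:00",
    },
    {"headline": "Welder", "number_of_vacancies": None},  # malformed on purpose
]

rows = [select_fields(ad) for ad in ads]
error_rows = sum(row is None for row in rows)
print(rows[0], error_rows)
```

Catching specific exceptions keeps the error count meaningful: an unexpected failure such as a typo in a variable name would surface instead of being silently counted as an erroneous ad.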