Software Development: Programming and Algorithms (SDPA) Coursework

The coursework of this unit consists of three main parts:

1- Software Development

In this part, I built a car rental system that customers can interact with: they can check which cars are available in stock, rent a car, and return it to have a bill issued. I instantiated three classes:

View a Snippet of The Source Code Below:



2- Algorithm Analysis

In this part, I wrote a function that takes two lists and produces an output list in which the elements of the first list (L1) are sorted according to the order of the elements in the second list (L2). Elements of L1 that are not present in L2 are placed at the end of the output list in sorted order.
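The behaviour described above can be illustrated with a minimal sketch. This is not the coursework's actual function: the name `sort_by_reference` and the sample lists are made up for illustration, and the sketch assumes the list elements are hashable and mutually comparable.

```python
def sort_by_reference(l1, l2):
    """Order l1 by the sequence of l2; elements not in l2 go last, sorted."""
    # Map each value in l2 to the position of its first occurrence.
    rank = {}
    for index, value in enumerate(l2):
        rank.setdefault(value, index)

    # Split l1 into elements that appear in l2 and those that do not.
    in_l2 = [x for x in l1 if x in rank]
    leftovers = [x for x in l1 if x not in rank]

    # Sort the first group by its position in l2, the second group naturally.
    return sorted(in_l2, key=rank.__getitem__) + sorted(leftovers)


# Hypothetical input lists, chosen only to show the ordering.
print(sort_by_reference([2, 1, 2, 5, 7, 1, 9, 3, 6, 8, 8], [2, 1, 8, 3]))
# prints [2, 2, 1, 1, 8, 8, 3, 5, 6, 7, 9]
```

With n elements in L1 and m in L2, this sketch builds the lookup table in O(m) and sorts in O(n log n), so it avoids rescanning L2 for every element of L1.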


3- Data Analytics

In this part, the data source was Twitter. I collected Premier League (PL) related tweets using the tweepy library for Python and saved them into a CSV file. I then carried out a comprehensive exploratory data analysis on this dataset. The variables of interest were
['user', 'text', 'date', 'followers_count', 'retweet_count', 'favorite_count', 'location', 'source', 'tweet_hour'].


I did the following steps:


Dataset Dataframe Overview


Dataset General statistics

Some of the analysis findings

Analysis #1:
This analysis gives an overview of the tweet stream over time, in ten-hour intervals. The flow of tweets was steady from December 23rd to December 26th. There is then a sharp increase on the 27th, followed by a slight decrease at the end of that day, and another sharp rise on the 28th. These two days (Sunday the 27th and Monday the 28th) were Premier League match days, so, as the graph shows, people interact more on Twitter on those days.
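Counting tweets in ten-hour intervals can be sketched with the standard library alone. The function name `bucket_counts`, the timestamp format, and the sample timestamps below are assumptions for illustration; they are not taken from the dataset or the original notebook.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_counts(timestamps, hours=10, fmt="%Y-%m-%d %H:%M:%S"):
    """Count timestamps in consecutive fixed-width windows (assumed format)."""
    parsed = sorted(datetime.strptime(ts, fmt) for ts in timestamps)
    if not parsed:
        return []
    # Start the first window at the top of the hour of the earliest tweet.
    start = parsed[0].replace(minute=0, second=0, microsecond=0)
    width = timedelta(hours=hours)
    # Integer division of two timedeltas gives the window index.
    counts = Counter((t - start) // width for t in parsed)
    # Include empty windows so gaps in the stream stay visible.
    return [(start + i * width, counts[i]) for i in range(max(counts) + 1)]


# Made-up timestamps, for illustration only.
sample = [
    "2020-12-23 00:10:00",
    "2020-12-23 09:59:00",
    "2020-12-23 10:00:00",
    "2020-12-24 06:00:00",
]
for window_start, count in bucket_counts(sample):
    print(window_start, count)
```

Keeping the empty windows in the result matters for a time-series plot: dropping them would make quiet periods look shorter than they were.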


Analysis #2:
This analysis explores the most frequent words in the dataset. Based on the word cloud, the words users tweet most revolve around the transfer period (transfermark), team coaches such as the Man United coach (Solskjaer), and team names on match days. Some leading sports TV channels, such as ESPNFC, also appear.


Analysis #3:
This analysis explores how tweets are distributed throughout the day. As the histogram shows, people interact and tweet the most from midday (12 pm) until about 4 pm. This is when most PL football matches take place.
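The hour-of-day distribution behind such a histogram can be sketched as follows. This is an illustrative sketch only: the function name `hourly_counts`, the timestamp format, and the sample values are assumptions, and the original analysis may have derived the `tweet_hour` column differently.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return a 24-element list: number of tweets posted in each hour."""
    # Extract the hour (0-23) from every timestamp, ignoring the date.
    hours = Counter(datetime.strptime(ts, fmt).hour for ts in timestamps)
    return [hours[h] for h in range(24)]


# Made-up timestamps, for illustration only.
sample = [
    "2020-12-27 12:05:00",
    "2020-12-27 12:45:10",
    "2020-12-27 15:30:00",
    "2020-12-28 09:00:00",
]
counts = hourly_counts(sample)
print(counts[12], counts[15], counts[9])  # prints 2 1 1
```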


Analysis #4:
This analysis finds the top PL hashtags in the dataset. Most people interact with the league's main hashtag (#PL), which appears in more than 150 tweets, and also with (#LIV) for Liverpool, who were number one in the league table when the dataset was collected. There are also interactions from Korean users: the third most common hashtag (#갓세븐) has the same meaning as the fourth (#GOT7).
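One way to count hashtags in tweet text is a regular expression plus a counter, as in the sketch below. The function name `top_hashtags` and the sample tweets are hypothetical, and upper-casing the tags so that `#pl` and `#PL` are counted together is an assumption, not necessarily what the original analysis did.

```python
import re
from collections import Counter

# In Python 3, \w matches Unicode word characters, so Korean tags also match.
HASHTAG = re.compile(r"#\w+")


def top_hashtags(texts, n=10):
    """Return the n most common hashtags as (tag, count) pairs."""
    counts = Counter()
    for text in texts:
        # Upper-case each tag so differently cased variants are merged.
        counts.update(tag.upper() for tag in HASHTAG.findall(text))
    return counts.most_common(n)


# Made-up tweets, for illustration only.
sample = ["Great win #PL #LIV", "#pl matchday", "#갓세븐 #GOT7"]
print(top_hashtags(sample, n=1))  # prints [('#PL', 2)]
```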


Analysis #5:
This analysis finds the ten users in the dataset with the most followers. Based on the findings, it seems that most people who are interested in PL football follow sports TV channels such as BBCSport, which has more than 8 million followers, as shown in the graph.


Analysis #6:
This analysis gives an overview of tweet sources and how often each one occurs. Based on the findings, the most common tweet sources are Android, iPhone and the Web App.


Analysis #7:
This analysis gives an overview of tweet locations to show how users around the world interact with the Premier League. Based on the findings, it seems that the most active users are from the United Kingdom, Nigeria, India and Ghana.


Analysis #8:
This analysis searches for the teams mentioned in PL related tweets. Based on the findings, it seems that the most mentioned teams are Chelsea, Arsenal (P.S. the team I love and support!), Aston Villa and Everton.
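A simple way to count team mentions is a case-insensitive substring search over the tweet text. The sketch below is an assumption about how such a search could work, not the original code; the function name `count_team_mentions`, the team list, and the sample tweets are made up for illustration.

```python
from collections import Counter


def count_team_mentions(texts, teams):
    """Count how many tweets mention each team (case-insensitive)."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for team in teams:
            # Count each team at most once per tweet.
            if team.lower() in lowered:
                counts[team] += 1
    return counts.most_common()


# Made-up tweets and a hypothetical team list, for illustration only.
sample = [
    "Chelsea vs Arsenal tonight",
    "Come on ARSENAL!",
    "Everton looking strong",
]
print(count_team_mentions(sample, ["Chelsea", "Arsenal", "Everton", "Aston Villa"]))
```

Plain substring matching is a deliberate simplification: it misses nicknames and abbreviations and can match a team name that appears inside a longer word.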





