Monday, September 24, 2018

Data Science & Analytics: Lecture 1

These are condensed notes on Dr. Eugene Rex Jalao's lectures for the Data Science & Analytics subject. Dr. Jalao is a professor of Industrial Engineering at UP Diliman.

I. What is Business Analytics?

- it is the utilisation of organisational and external data to provide: Timely, Accurate, High-value, Actionable decisions. (TAHA)

- it is an umbrella term that combines: Architecture, Tools, Databases, Analytical Tools, Applications and Methodologies. (ATDAAM)

- the main goal of business analytics is to provide easy access to data and models so that managers can carry out analysis themselves.

- it is an entire field encompassing technology, work processes, human and organisational factors.

- business analytics is not just a tool, nor merely a collection of reports, dashboards, and visualisations.

II. History of Business Analytics

1. Early 1970s to mid-1980s
- paper reports stored in early databases.

2. 1980s to 1990s
- the start of early automation: paper reports with decision support systems, stored in early data warehouses.

3. Rest of the 1990s
- the rise of Online Analytical Processing (OLAP); less paper was used, with more data warehouses and data marts.

4. 2000s
- the rise of next generation OLAP integrated with data mining and visualisation.

5. Early 2010s
- The rise of business intelligence and visualisation.

III. Top Technological Strategy Trends for 2018

1. Intelligent:
a. A.I. foundations
b. Intelligent apps & analytics
c. Intelligent things

2. Digital:
a. Digital twins
b. Cloud on the edge
c. Conversational platforms & immersive experiences

3. Mesh
a. Blockchain
b. Event-driven
c. Continuous adaptive trust & risk

IV. Priority Ranking of I.T. Technology Investment as of 2018
1. B.I. & Big Data
2. Process Automation
3. Cloud Software as a Service
4. Service Management
5. Legacy System Enhancements

VI. Some Challenges Data Professionals Face
1. Dirty data
2. Lack of data science talent
3. Corporate politics
4. Lack of support
5. Access to data

VII. Business Analytics Framework
1. Source Systems
- the sources of data that drive the analytics.
- examples: OLTP systems, ERP systems, external data, other sources of data.

2. Integration systems
- systems that extract, transform, and load (ETL) data into data warehouses.

3. Data Management Systems
- data warehouses that store and load cleaned data for analytics.

4. Analytics
- examples: EDA, data mining, optimization, and simulation
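The first three layers of the framework can be sketched in a few lines. This is only an illustrative sketch, not the lecture's implementation: the CSV text stands in for a source system, the `extract`, `transform`, and `load` functions are hypothetical names for the integration step, and an in-memory SQLite database stands in for a data warehouse. The sample rows are invented.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (stray spaces, one missing amount).
raw_csv = """order_id,customer,amount
1, Alice ,100.50
2,Bob,
3,carol,75
"""

def extract(text):
    """Read the raw rows exactly as the source system produced them."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Clean the rows: drop missing amounts, tidy names, convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip rows with a missing amount
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().title(), float(row["amount"]))
        )
    return cleaned

def load(rows, connection):
    """Store the cleaned rows in the (stand-in) warehouse table."""
    connection.execute(
        "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()

warehouse = sqlite3.connect(":memory:")
load(transform(extract(raw_csv)), warehouse)

# The analytics layer then queries the cleaned data.
row_count = warehouse.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
total_amount = warehouse.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(row_count, total_amount)
```

Keeping the three steps as separate functions mirrors the framework: each layer can be replaced without touching the others.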

VIII. Different Types of Business Analytics
a. Descriptive Analytics
- also called exploratory data analysis
- answers the questions:
1. What happened and why?
2. What is happening now?

- the purpose is to describe and summarize the data using graphs and basic statistical techniques to generate reports, dashboards and visualizations.
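As a small illustration of "describe and summarize", the sketch below computes basic statistics with Python's standard `statistics` module. The sales figures and the `describe` helper are invented for this example.

```python
import statistics

# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [120, 135, 128, 150, 142, 160]

def describe(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

summary = describe(monthly_sales)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A report or dashboard would typically present these same figures as a table or chart.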

b. Predictive Analytics
- machine learning and data mining fall under predictive analytics.
- answers the questions:
1. What is likely to happen?
2. Tell me something interesting, without me asking

- it finds patterns and trends based on historical data to provide useful information for decision making.

- there are two types of machine learning:
1. Supervised Machine Learning
- tries to predict a labelled response variable.
- examples are:
a. Classification: prediction of a categorical response variable.
b. Regression: prediction of a numerical response variable.
c. Time Series: prediction of a numerical response variable based on time predictors.
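The difference between a numerical and a categorical response can be shown with two tiny models. This is a minimal sketch under simple assumptions: one predictor, ordinary least squares for regression, and a 1-nearest-neighbour rule for classification. The function names and data are hypothetical and were chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nearest_neighbour(train, query):
    """1-nearest-neighbour classification on a single numeric feature."""
    closest = min(train, key=lambda pair: abs(pair[0] - query))
    return closest[1]

# Regression: the labelled response (revenue) is numerical.
ad_spend = [1, 2, 3, 4]
revenue = [3, 5, 7, 9]
slope, intercept = fit_line(ad_spend, revenue)
predicted_revenue = slope * 5 + intercept

# Classification: the labelled response (segment) is categorical.
labelled = [(1, "low"), (2, "low"), (8, "high"), (9, "high")]
predicted_label = nearest_neighbour(labelled, 7)

print(predicted_revenue, predicted_label)
```

In both cases the model learns from examples whose response is already known, which is what makes the learning supervised.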

2. Unsupervised Machine Learning
- the data in unsupervised learning are unlabelled.
- the method "fishes" for patterns.
- examples of unsupervised machine learning are:
Clustering
- divides data points into groups called clusters or segments.
- the variance of data points within the same cluster should be as small as possible (cohesion), while the variance between clusters should be as large as possible (separation).
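A rough sketch of the idea, using k-means on one-dimensional data. K-means is used here only as one common clustering method, not necessarily the one covered in the lecture. The points, starting centroids, and function names are assumptions made for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data; `centroids` holds the starting guesses."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (unchanged if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

def within_cluster_ss(centroids, clusters):
    """Sum of squared distances of points to their own centroid (lower is tighter)."""
    return sum((p - m) ** 2 for m, c in zip(centroids, clusters) for p in c)

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])

within_ss = within_cluster_ss(centroids, clusters)  # spread inside clusters
separation = abs(centroids[0] - centroids[1])       # distance between clusters
print(centroids, within_ss, separation)
```

A good clustering keeps `within_ss` small relative to `separation`.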

Association Rule
- identifies strong rules that associate two items with each other, ranked by a measure of interestingness.
- measures the probability that one item is associated with another item.
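Three commonly used interestingness measures are support, confidence, and lift. The notes do not say which measure the lecture used, so treat the choice here as an assumption. The transactions are invented.

```python
# Hypothetical market-basket transactions, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline frequency."""
    return confidence(antecedent, consequent) / support(consequent)

rule_confidence = confidence({"bread"}, {"milk"})
rule_lift = lift({"bread"}, {"milk"})
print(rule_confidence, rule_lift)
```

A lift above 1 suggests the two items appear together more often than chance would predict; a lift below 1 suggests the opposite.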

Sequential Pattern Analysis
- given a series of items and their corresponding time sequence, it provides Apriori-style rules that measure the probability that a consequent item will occur, given that an antecedent item has occurred earlier.
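The probability described above can be estimated by simple counting. The following is a simplified sketch, not a full sequential pattern mining algorithm: each list is one customer's purchases in time order, and both the data and the `sequence_confidence` name are hypothetical.

```python
# Hypothetical purchase sequences, each ordered by time.
sequences = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["case", "phone"],
    ["laptop", "mouse"],
]

def sequence_confidence(antecedent, consequent):
    """Share of sequences containing `antecedent` where `consequent` appears later."""
    with_antecedent = [s for s in sequences if antecedent in s]
    if not with_antecedent:
        return 0.0
    followed = sum(
        consequent in s[s.index(antecedent) + 1:] for s in with_antecedent
    )
    return followed / len(with_antecedent)

phone_then_charger = sequence_confidence("phone", "charger")
phone_then_case = sequence_confidence("phone", "case")
print(phone_then_charger, phone_then_case)
```

Unlike a plain association rule, order matters here: a case bought before the phone does not count toward the "phone then case" rule.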

Text Mining
- uses term frequency-inverse document frequency (TF-IDF) to determine the dominant words in a text.
- other methods use perplexity and Shannon entropy to determine informative words.
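A minimal sketch of the term-weighting idea, assuming the basic formulation tf × log(N / df) with whitespace tokenisation. Real libraries often apply smoothing and normalisation that are omitted here. The documents and helper names are invented.

```python
import math

# Hypothetical mini-corpus, invented for illustration.
documents = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird watched the bird feeder",
]
tokenised = [doc.split() for doc in documents]

def tf(term, tokens):
    """Term frequency: share of the document's tokens equal to `term`."""
    return tokens.count(term) / len(tokens)

def idf(term):
    """Inverse document frequency; assumes `term` occurs in at least one document."""
    containing = sum(term in tokens for tokens in tokenised)
    return math.log(len(tokenised) / containing)

def tf_idf(term, tokens):
    return tf(term, tokens) * idf(term)

def top_term(tokens):
    """Highest-weighted term in a document (ties broken alphabetically)."""
    return max(sorted(set(tokens)), key=lambda t: tf_idf(t, tokens))

dominant = top_term(tokenised[2])
print(dominant, tf_idf("the", tokenised[0]))
```

A word such as "the" appears in every document, so its weight is zero, while a word concentrated in one document scores highest.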

Social Media Sentiment Analysis
- from a given set of texts, determines the probability of a sentiment.
- methods used can be naive Bayes classifiers, decision trees, and other classification methods.
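As a rough illustration of the naive Bayes approach, the sketch below trains a multinomial classifier with add-one (Laplace) smoothing on four invented, labelled texts and returns a probability for each sentiment. The training data, the choice to ignore unseen words, and the function name are all assumptions.

```python
import math
from collections import Counter

# Hypothetical labelled texts, invented for illustration.
training = [
    ("great service and friendly staff", "positive"),
    ("love the great food", "positive"),
    ("terrible service and rude staff", "negative"),
    ("hate the terrible food", "negative"),
]

word_counts = {"positive": Counter(), "negative": Counter()}
doc_counts = Counter()
for text, label in training:
    doc_counts[label] += 1
    word_counts[label].update(text.split())
vocabulary = {word for counts in word_counts.values() for word in counts}

def sentiment_probabilities(text):
    """Return P(label | text) for each label under naive Bayes."""
    log_scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / len(training))  # class prior
        for word in text.split():
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocabulary))
                )
        log_scores[label] = score
    # Convert log scores into probabilities that sum to 1.
    peak = max(log_scores.values())
    weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}

probabilities = sentiment_probabilities("great food")
print(probabilities)
```

The classifier needs labelled examples to train on, so in practice a set of texts with known sentiment has to be prepared first.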

c. Prescriptive Analytics
- optimization and simulation fall under prescriptive analytics.
- in optimization, the best solution is found subject to a set of constraints.
- in simulation, a model imitates the real system, providing an artificial alternative from which inferences can be drawn.
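Both ideas can be sketched on toy problems. The optimization part searches a small integer grid by brute force, where a real solver would use linear programming. The simulation part is a Monte Carlo estimate of a stock-out probability. The profit coefficients, constraints, stock level, and demand range are all invented for illustration.

```python
import random

# Optimization: maximise profit 3x + 5y subject to
#   x <= 4, 2y <= 12, 3x + 2y <= 18, with x and y non-negative integers.
def feasible(x, y):
    return x <= 4 and 2 * y <= 12 and 3 * x + 2 * y <= 18

candidates = [(x, y) for x in range(0, 5) for y in range(0, 7) if feasible(x, y)]
best_plan = max(candidates, key=lambda plan: 3 * plan[0] + 5 * plan[1])
best_profit = 3 * best_plan[0] + 5 * best_plan[1]

# Simulation: imitate daily demand to estimate how often stock runs out.
def estimate_stockout(stock, trials, seed=42):
    """Monte Carlo estimate of P(demand > stock), demand uniform on 80..120."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    stockouts = sum(rng.randint(80, 120) > stock for _ in range(trials))
    return stockouts / trials

stockout_rate = estimate_stockout(stock=100, trials=10_000)
print(best_plan, best_profit, stockout_rate)
```

The optimization returns a single recommended plan, while the simulation returns an estimate whose accuracy improves as the number of trials grows.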

IX. APEC Recommended Competencies
a. Business & Organizational skills
- involves business analytics, visualization, data management & governance, domain knowledge.

b. Technical skills
- involves statistical techniques, computing, data analytics & research methods.

c. Workplace skills
- communication, storytelling, ethics, and entrepreneurship.
