6 Excel Add-Ins Every Data Analyst and Data Scientist Must Install


No matter what you do, you can't avoid Excel. So you may as well dive in and tame the beast.


Here are six Excel add-ins that every data analyst and data scientist should install.

1. Kutools for Excel

Kutools for Excel is a handy add-in with more than 300 advanced features that reduce all kinds of complicated tasks to a few clicks. For example, users can combine worksheets in several clicks, merge cells without losing data, paste to only visible cells, and so on.

How do I insert Kutools in Excel? Click the Office Button in Excel 2007 (if you are still using that version), or the File tab in Excel 2010/2013/2016; click Options; then, in the Excel Options window, click Add-Ins. From there you can view and manage all add-ins in Excel.

2. Power Query

Power Query:

  • Power Query enhances self-service business intelligence (BI) for Excel with an intuitive and consistent experience for discovering, combining, and refining data across a wide variety of sources including relational, structured and semi-structured, OData, Web, Hadoop, and more.
  • Microsoft Power Query for Excel is an add-in that provides a seamless experience for data discovery, data transformation, and enrichment for information workers, BI professionals, and other Excel users.
  • With Power Query you can:
    • Identify the data you care about from the sources you work with (e.g. relational databases, Excel, text and XML files, OData feeds, web pages, Hadoop HDFS, etc.).
    • Discover relevant data from inside and outside your organization using the search capabilities within Excel.
    • Combine data from multiple, disparate data sources and shape it to prepare it for further analysis in tools like Excel and Power Pivot, or for visualization in tools like Power View and Power Map.
    • Share the queries that you create with others in your organization, so they can be easily found via Search.

The Query Editor only appears when you load, edit, or create a new query. To view the Query Editor without loading or editing an existing workbook query, select From Other Sources > Blank Query in the Get External Data section of the Power Query ribbon tab.
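Power Query's discover, combine, and shape workflow has a rough analogue in pandas. The sketch below is illustrative only (the table and column names are made up, and this is pandas, not Power Query itself):

```python
import io
import pandas as pd

# Two disparate "sources": a CSV extract and an in-memory reference table
orders = pd.read_csv(io.StringIO(
    "order_id,customer_id,amount\n1,10,99.5\n2,11,42.0\n"))
customers = pd.DataFrame({"customer_id": [10, 11],
                          "region": ["East", "West"]})

# Combine (merge on a shared key) and shape (derive a column) for analysis
combined = orders.merge(customers, on="customer_id", how="left")
combined["large_order"] = combined["amount"] > 50
print(combined)
```

The same merge-then-derive pattern underlies most Power Query mashups, whatever the source.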

3. Power Pivot

Top features of Power Pivot for Excel

  • Import millions of rows of data from multiple data sources   With Power Pivot for Excel, you can import millions of rows of data from multiple data sources into a single Excel workbook, create relationships between heterogeneous data, create calculated columns and measures using formulas, build PivotTables and PivotCharts, and then further analyze the data so that you can make timely business decisions, all without requiring IT assistance.
  • Fast calculations and analysis   Process millions of rows in about the same time as thousands, and make the most of multi-core processors and gigabytes of memory for the fastest processing of calculations. Efficient compression algorithms overcome existing desktop limits on massive data analysis, loading even the biggest data sets into memory.
  • Virtually unlimited support of data sources   Provides the foundation to import and combine source data from any location for massive data analysis on the desktop, including relational databases, multidimensional sources, cloud services, data feeds, Excel files, text files, and data from the web.
  • Security and management   The Power Pivot Management Dashboard enables IT administrators to monitor and manage shared applications to ensure security, high availability, and performance.
  • Data Analysis Expressions (DAX)   DAX is a formula language that extends the data manipulation capabilities of Excel to enable more sophisticated grouping, calculation, and analysis. The syntax of DAX formulas is very similar to that of Excel formulas.
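For readers who live in Python, DAX-style calculated columns and measures map loosely onto pandas operations. The sales table below is made up, and the DAX snippets in the comments are only rough counterparts:

```python
import pandas as pd

# Hypothetical sales table, standing in for a Power Pivot data model
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "units": [10, 5, 8, 2],
    "unit_price": [3.0, 3.0, 4.0, 4.0],
})

# Calculated column, roughly:  Amount = Sales[units] * Sales[unit_price]
sales["amount"] = sales["units"] * sales["unit_price"]

# Measure, roughly:  East Sales := CALCULATE(SUM(Sales[amount]),
#                                            Sales[region] = "East")
east_sales = sales.loc[sales["region"] == "East", "amount"].sum()
print(east_sales)  # 45.0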

4. Solver Add-In

The Solver Add-in is a Microsoft Office Excel add-in program that is available when you install Microsoft Office or Excel.

Three other kinds of what-if analysis tools come with Excel: Scenario Manager, Goal Seek, and data tables.

To use the Solver Add-in, however, you first need to load it in Excel.

Solver is a what-if analysis tool for optimization. It is an add-in licensed from Frontline Systems that has shipped with Excel for many years.


Solver helps you find an optimal value for one cell, called the target cell, on your worksheet. It offers three solving methods for spreadsheet optimization problems.

  • Simplex method:  The Simplex method is used for solving linear problems. It received several performance enhancements in Excel 2010, resulting in greatly improved performance for some problem types.
  • GRG Nonlinear:  The GRG solver is used for solving smooth nonlinear problems. A Multistart search setting, used in conjunction with the GRG solver, yields better solutions by escaping locally optimal solutions in favor of globally optimal ones.
  • Evolutionary:  The Evolutionary solver accepts models defined in exactly the same way as the Simplex and GRG solvers, but uses genetic algorithms to find its solutions. While the Simplex and GRG solvers handle linear and smooth nonlinear problems, the Evolutionary solver can be used with any Excel formulas or functions, even those that are neither linear nor smooth nonlinear. Spreadsheet functions such as IF and VLOOKUP fall into this category.
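Outside Excel, the linear problems that Solver's Simplex method handles can be solved the same way with SciPy. A minimal sketch (the toy objective and constraints are made up, and SciPy's solver is an analogue, not Solver itself):

```python
from scipy.optimize import linprog

# Toy production-planning LP: maximize 3x + 2y
# subject to x + y <= 4 and x <= 2, with x, y >= 0.
# linprog minimizes, so we negate the objective coefficients.
c = [-3, -2]
A_ub = [[1, 1], [1, 0]]
b_ub = [4, 2]

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)  # optimal plan (x=2, y=2) and its profit (10)
```

The target cell in Solver corresponds to the objective `c`, and the constraint cells to the rows of `A_ub`.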

There is a new type of integer constraint known as All Different, which requires a set of cells to form a permutation of integers, making it easy to define models with ordering or sequencing. The well-known travelling salesman problem is hard to define in the classic Solver, but can be defined with just an objective and one AllDifferent constraint. Microsoft has also added a number of new report types that provide additional detail about the problem being solved.

  • Linearity Report If you try to solve a model that is not linear, Solver will indicate that linearity conditions were not satisfied.
  • Feasibility Report  If you try to solve a model that has no feasible solution, Solver will display the message “Solver could not find a feasible solution” in the Solver Results dialog.
  • Population Report   Where the Simplex and GRG Solvers find a single solution, the Evolutionary Solver creates a population of candidate solutions. When this Solver Engine stops, the best solution from the population is displayed on the worksheet, but a new report is available in the Solver Results dialog.

5. Analysis ToolPak

If you need to develop complex statistical or engineering analyses, you can save steps and time by using the Analysis ToolPak. You provide the data and parameters for each analysis, and the tool uses the appropriate statistical or engineering macro functions to calculate and display the results in an output table. Some tools generate charts in addition to output tables.

The Analysis ToolPak includes various tools; some of them are described below.

  • ANOVA  The Anova analysis tools provide different types of variance analysis. The tool that you should use depends on the number of factors and the number of samples that you have from the populations that you want to test.
  • Descriptive Statistics   The Descriptive Statistics analysis tool generates a report of univariate statistics for data in the input range, providing information about the central tendency and variability of your data.
  • Fourier Analysis    The Fourier Analysis tool solves problems in linear systems and analyzes periodic data by using the Fast Fourier Transform (FFT) method to transform data. This tool also supports inverse transformations, in which the inverse of transformed data returns the original data.
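The same three analyses are easy to reproduce outside Excel. A minimal Python sketch, using made-up sample data (the normal distributions below are purely illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(10.0, 2.0, 50)   # sample from population A
b = rng.normal(11.0, 2.0, 50)   # sample from population B

# Descriptive Statistics: central tendency and variability
print(a.mean(), np.median(a), a.std(ddof=1))

# Single-factor ANOVA: do the group means differ?
f_stat, p_value = stats.f_oneway(a, b)
print(f_stat, p_value)

# Fourier Analysis: the FFT and its inverse round-trip the data
spectrum = np.fft.fft(a)
recovered = np.fft.ifft(spectrum).real
print(np.allclose(recovered, a))  # True
```

Each block corresponds to one ToolPak dialog: Descriptive Statistics, Anova: Single Factor, and Fourier Analysis with the inverse option.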

6. ASAP Utilities

Time saving – speed up your work in Excel

ASAP Utilities bundles many time-saving features and additions for Excel.
The extensive feature set is the result of years of experience and development, combined with feedback from many respected users at different levels of experience. User feedback in particular has been very important, and will continue to be important in further development.


ASAP = As Soon As Possible

ASAP Utilities is named this way because it significantly reduces time-consuming work. ASAP is an acronym for “As Soon As Possible”; some choose to spell it out, others pronounce it “aysap”. A utility is something useful. Put together, ASAP Utilities stands for useful, time-saving macro tools for Excel.
Most people struggle with managers, bosses, or clients who ask for the job to be done yesterday, which is of course difficult to pull off. ASAP Utilities doesn't let you travel back in time, but it can certainly help you speed up your work in Excel.


Opioids and Machine Learning Algorithms


Predicting Prescriber-Induced Overdoses Using Machine Learning Algorithms

Introduction

For my capstone project in the data science immersive program at General Assembly, I decided to tackle the major public health issue that is the opioid crisis in the United States of America. How did opioids emerge as a public health problem? It's worth looking at the history. I present it very concisely here; further research may be necessary for the details.

How opioids became a catastrophe

Right now, our country faces an average of 115 opioid overdose deaths each day, according to data from CDC WONDER. Over 280 million prescriptions were written in 2012 alone, a big contributing factor to this issue. Prescription opioid abuse, misuse, and dependence is a daily public health hazard now; according to the Centers for Disease Control and Prevention (CDC), over 1,000 people are treated in the emergency department every day for misusing prescription opioid drugs. The number of overdose deaths from substance abuse in the US increased from around 9,000 in 1990 to more than 64,000 in 2016, a 23% increase over the previous year. According to the data source, 75% of drug OD deaths involve opioids (prescribed or illegal). Prescription opiate drugs, especially methadone, OxyContin, and hydrocodone, are believed to have played a significant role in this public health crisis sweeping the US. In addition to the health issues arising from overdoses, the opioid epidemic requires significant economic resources from city and state governments for emergency call response and policing. The estimated total cost of the US opioid epidemic exceeds 78 billion dollars.

Opioids: addictive painkillers

The major source of diverted opioids is physician prescription. However, opioid prescription for patients with acute pain and patients with chronic pain requires a careful distinction. As opioids are regarded as among the most effective drugs for acute pain management, limiting their use for patients in urgent need of pain control (post-surgical patients, cancer patients, and other health crises) would not only be inhumane but would also defeat their intended purpose. On the other hand, the use of opioids for chronic non-malignant pain control has remained controversial for decades, and it deserves a closer look in light of the current opioid health crisis.

The severity of chronic pain compared to coronary artery disease (CAD)

As data scientists, we can't really question whether doctors are pushing prescriptions because of sponsorship from pharmaceutical companies, or whether they are constantly barraged by patients who claim severe pain and actually need the medication, or most other factors for that matter. So what I wanted to do was estimate the likelihood that a provider would prescribe an opioid and identify any patterns in that data. Based on my own research of materials published by the CDC and the National Institutes of Health, the number one cause of death for ages 15–50 in the US is drug overdose. The comparison with other significant events from our history is as follows:

  • Drug OD deaths in 2016: 64,000
  • Vietnam War total deaths (1965–1975): 58,000
  • Motor vehicle accident deaths in 2016: 40,000
  • HIV deaths in peak year (1995): 47,000
  • Gun deaths in peak year (1993): 39,000

This study attempts to build a predictive model of the likelihood of a health care provider prescribing opioid drugs to patients with chronic pain. More specifically, we identify features of non-opioid prescription history that correlate with opioid prescription. In addition, we distinguish the gender, specialty, and location most highly correlated with prolonged, long-term (more than 12 weeks) supply of opioids.

Datasets and Inputs

A total of eight datasets (detailed data, provider summary, national drug summary table, national health expenditure (NHE), property and violent crime by state, mental health deaths, and state drug summary table) were obtained from the web pages of the Centers for Medicare and Medicaid Services (CMS), the Kaiser Family Foundation (KFF), CDC WONDER, and the Department of Justice. The detailed data contains information on prescription drugs prescribed by individual health care providers and paid for under the Medicare Part D Prescription Drug Program in 2016. It includes detailed prescription information such as the brand drug name, the number of patients who filled the drug more than ten times, and the aggregate number of days' supply for which the prescription drug was dispensed.

Sample data on prescriber information

The provider summary table contains the demographics of the individual prescribers (n=1,131,550). Briefly, it includes the National Provider Identifier (NPI) number, name, gender, address, medical credential, specialty, and Medicare enrollment status. It also includes summaries of the abstracted clinical data, such as the prescriber's total claim count, total opioid claim count based on the supply of all prescription drugs, total claims of antibiotic drugs, total claims of high-risk medication (HRM) drugs, total claims of antipsychotic drugs, and, more importantly, the number of patients treated with opioid prescriptions and the ratio of opioid to non-opioid prescriptions.

The remaining two core datasets are the national and state drug summary tables. They list the prescription drug names; whether each drug is categorized as an antibiotic (n=78), opioid (n=29), antipsychotic (n=28), HRM (n=68), or other (n=951); and the number of prescribers of that drug, grouped by nation and by state respectively. For more detail on how the dataset was collected, please refer to the CMS webpage here. The NHE dataset shows the percentage share of prescription drugs in the national health budget, covering 1960–2016. The KFF dataset includes all mental health deaths by state that have some level of connection with opioids. The crime dataset was collected from multiple sources and agencies using queries to see whether crimes were linked to opioids, broken down by state and by whether they happened in urban or rural areas of the country.

A box plot was chosen to show the top prescriber specialties for the abused opioid drugs. It turns out that cardiology (diseases and abnormalities of the heart) has the highest variability, followed by endocrinology (the branch of physiology and medicine concerned with endocrine glands and hormones), psychiatry (the study and treatment of mental illness, emotional disturbance, and abnormal behavior), nephrology (the branch of medicine that deals with the physiology and diseases of the kidneys), and pulmonary disease (chronic obstructive pulmonary disease, or COPD: disease such as emphysema or chronic bronchitis characterized by chronic, typically irreversible airway obstruction resulting in a slowed rate of exhalation).

My Solution Statement

I start by merging two datasets, detailed data and provider summary, to get combined features of providers' personal information and prescription history of drugs (sorted by generic name, n=1154). Two new attributes, op_avg_supply and op_longer, are added. op_avg_supply is the aggregate number of days' supply divided by the total number of patients for whom opioid drugs were dispensed. op_longer is labeled 1 if the provider's op_avg_supply is greater than 84 days (12 weeks), and 0 if less. I then train supervised classification models on the op_longer labels to solve this large-scale nonlinear problem. The project consists of exploratory data analysis; a train/test split; a pipeline to fine-tune models via scikit-learn; and an extension to neural networks in TensorFlow.
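The two derived attributes can be sketched in pandas; the column names below are illustrative stand-ins, not the actual CMS field names:

```python
import pandas as pd

# Toy provider-level frame standing in for the merged CMS data
df = pd.DataFrame({
    "npi": [1, 2, 3],
    "opioid_day_supply": [900, 120, 0],   # aggregate days of opioid supply
    "opioid_patients": [10, 3, 0],        # patients dispensed opioids
})

# op_avg_supply: aggregate days' supply per opioid patient
# (0/0 yields NaN for non-prescribers, which we fill with 0)
df["op_avg_supply"] = (df["opioid_day_supply"] / df["opioid_patients"]).fillna(0)

# op_longer: 1 if the provider's average supply exceeds 84 days (12 weeks)
df["op_longer"] = (df["op_avg_supply"] > 84).astype(int)
print(df[["npi", "op_avg_supply", "op_longer"]])
```

In the toy frame, the first provider averages 90 days per patient and is labeled 1; the others are labeled 0.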

As the detailed data contains more than 25 million instances and its volume exceeds 3.2 GB, it easily exhausts the memory of a local machine, especially when combined with the prescriber summary dataset to make a wide table. As such, we utilize pandas' TextFileReader and read in the instances in small chunks (size=100,000). To ensure seamless and reproducible inputs, I automate the data input pipeline with data cleaning, feature selection, and feature scaling.
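Chunked reading keeps memory use flat no matter how large the file is. A minimal sketch (the in-memory CSV stands in for the 3.2 GB file, and the tiny chunk size is illustrative):

```python
import io
import pandas as pd

# Stand-in for the multi-gigabyte detailed-data CSV
csv_data = io.StringIO(
    "npi,drug,claims\n1,oxycodone,5\n2,gabapentin,2\n3,oxycodone,7\n")

total_claims = 0
# chunksize makes read_csv return a TextFileReader that yields
# DataFrames lazily instead of loading the whole file at once
for chunk in pd.read_csv(csv_data, chunksize=2):
    # per-chunk cleaning / feature selection / aggregation goes here
    total_claims += chunk["claims"].sum()

print(total_claims)  # 14
```

Only one chunk lives in memory at a time, so the same loop scales to the full 25-million-row file.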

Exploratory Data Analysis

To gain insights, I explored the data by visualizing each attribute for some randomly selected features, as shown in the visuals below.

According to the datasets, the number of opioid prescriptions dispensed by doctors steadily increased from 112 million in 1992 to a peak of 282 million in 2012, then declined to 236 million by 2016. In 2016, 6.2 billion hydrocodone pills were distributed nationwide, and, according to the CDC, around 5 billion oxycodone tablets were distributed in the United States, making oxycodone (Percocet) the second most prevalent opioid. From these numbers I come to see a vicious-circle problem: the US is 5% of the world's population, yet 80% of the world's opioids are prescribed here, and the severity of opiate over-prescription is easy to see. Based on the dataset, the state of California leads with the highest number of overdose deaths. Kentucky has been hit particularly hard, with 1,419 reported overdose deaths in 2016 (33.5 per 100,000 people) according to the CDC WONDER data; of those deaths, 989 (23.6 per 100,000 people) involved some type of opioid. It was followed by West Virginia, with 884 reported overdose deaths (52 per 100,000 people).

 

Opioid-related deaths count by state

State and specialty by top prescribed opioids

Opioid prescription rate versus the extended-release rate by state

Interactive visualization of specialty variability per opioid prescription

Specialty variability of prescribers' average opioid supply

State visuals: the darker the color, the higher the count of deaths due to opioids.

 

What did I learn from comparing states and from the opioid prescription and extended-release rates above? I will summarize. Opioids bind to receptors in the brain and spinal cord, which disrupts the pain signal and activates the reward areas of the brain, causing dopamine to be released and creating a feeling of euphoria: being high, or happy. The gold standard for pain medication is morphine, and other opioids are converted to morphine milligram equivalents (MME). Morphine and codeine are naturally derived from the opium poppy plant, most commonly grown in Asia and in Central and South America. Illegal drugs like heroin are synthesized from morphine, according to the CDC. Hydrocodone and oxycodone, on the other hand, are semi-synthetic opioids manufactured in laboratories. According to the dataset and the graphs above, between 2007 and 2016 the most widely prescribed opioid, across almost all specialties and all 50 states, was hydrocodone (Vicodin); see the interactive "State and specialty by top prescribed opioids" graph on my public Tableau account. Methadone is another fully synthetic opioid, dispensed to patients recovering from heroin addiction to relieve the symptoms of withdrawal. Opioid use disorder is the clinical term used in medical practice for opioid addiction or abuse. People who become dependent on any of these painkillers may experience withdrawal symptoms when they stop taking the pills, and, as anyone who has worked in medical practice knows, dependence is often coupled with tolerance, meaning that opioid users need increasingly larger doses of the medication for the same effect. The economic term for this effect is marginal utility: the additional satisfaction or euphoria an opioid user gets from consuming one more pill.
According to the Substance Abuse and Mental Health Services Administration, about 11.5 million Americans age 12 and older misused prescription pain medications in 2016, and around 948,000 (0.3% of the US population age 12 and above) used heroin in 2016. These are very significant numbers that demand strategic planning and action; perhaps that is why, on February 9, 2018, the Trump Administration allocated around 6 billion dollars for the opioid program, with $3 billion for 2018 and $3 billion for 2019, to tackle the epidemic.

Methods

The baseline accuracy to beat on this dataset is 53.7%. The problem with machine learning is that building an effective model can require a ton of human input. While working on my capstone, time was demanding: I was also engaged in the Washington, DC crime-prediction project, and data cleaning took almost my entire time. I had never taken seriously the claim that most data scientists spend 80% of their time cleaning and wrangling data. Humans have to figure out the right way to transform the data before feeding it to the machine learning model. Then they have to pick the model that will learn from the data best, and then there is a whole bunch of model parameters to tweak that can make the difference between a dud and a Nostradamus-like model. Building these pipelines, i.e., sequences of steps that turn the raw data into a predictive model, can easily take weeks of tinkering depending on the difficulty of the problem. This is obviously a huge issue when machine learning is supposed to allow machines to learn on their own. When almost all of my models performed well on the training data but underperformed on the test data, I turned to the Tree-based Pipeline Optimization Tool (TPOT). TPOT is a Python tool that automatically creates and optimizes machine learning pipelines using genetic programming. Think of TPOT as your “Data Science Assistant”: it automates the most tedious part of machine learning by intelligently exploring thousands of possible pipelines, then recommending the ones that work best for your data.

 

Machine learning pipeline and the parts of the pipeline that TPOT automates

Based on this, TPOT recommended a decision tree classifier. When I ran the model after tuning hyperparameters, I got 89% accuracy on the test data and 99.8% on the training data, which differs from TPOT's own estimate.
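That kind of train/test gap is easy to reproduce with an unconstrained decision tree. A small scikit-learn sketch on synthetic data (not the actual prescriber features):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the prescriber table
X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1,
                                                    random_state=42)

# An unconstrained tree grows until it memorizes the training set
tree = DecisionTreeClassifier(random_state=42)
tree.fit(X_train, y_train)
train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(train_acc, test_acc)  # train accuracy near 1.0, noticeably above test
```

Constraining `max_depth` or `min_samples_leaf` is the usual way to shrink this gap.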

 

Decision tree classifier fitted on training data and scored on test data

I then tried other classifiers: random forest, logistic regression, decision tree, AdaBoost, and a neural network using Keras. Finally, I chose the neural network, which performed better than the TPOT-recommended pipeline. The network uses features such as gender and specialty, and the target variable is a binary classification: opioid prescriber or not (0 or 1).


The model was trained on 21,813 samples and tested on 2,424 samples. The network takes 354 input features, with hidden layers using the ReLU activation function. The output layer uses a sigmoid, since this is a classification problem. The model is compiled with binary cross-entropy as the loss function, the Adam optimizer, and an accuracy metric, with early stopping to avoid exhaustive training.
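A rough scikit-learn analogue of that setup, on synthetic data (the real model was built in Keras on the 354 prescriber features; the sizes and layer widths below are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the prescriber feature matrix
X, y = make_classification(n_samples=3000, n_features=50, n_informative=15,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1,
                                                    random_state=0)

# ReLU hidden layers, Adam optimizer, early stopping, mirroring the Keras
# setup; sklearn's log-loss on a binary target is binary cross-entropy,
# and the output is a logistic (sigmoid) probability.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu",
                    solver="adam", early_stopping=True, max_iter=200,
                    random_state=0)
mlp.fit(X_train, y_train)
print(mlp.score(X_test, y_test))
```

`early_stopping=True` holds out a validation fraction and stops training once validation score stops improving, just like a Keras EarlyStopping callback.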


The accuracy score of the model was 91.4%, with a loss of 21.7%.

Model accuracy and loss for the neural network

Many features came up across all the models. Carisoprodol, diazepam, alprazolam, gabapentin, pregabalin, hydroxychloroquine sulfate, cyclobenzaprine HCl, and tizanidine HCl were highly associated with the label in every model. As expected from the EDA section, among medical specialties, pain management, interventional pain management, rheumatology, hematology, cardiology, family practice, and hospice and palliative care were estimated to have the label true with higher probability.

 

A limitation of the study is that the dataset includes only a snapshot of the total population, since Medicare beneficiaries are an elderly population, age 65 and above. The original materials can be accessed here.
This article is a living and breathing document, and your feedback matters to me. Please leave comments on what resources I should add and what you found most helpful when you were learning machine learning. Thanks a lot for reading!

 

 

 

How I became a Data Scientist



At the end of graduate school at the University of Liverpool, with thoughts of impending graduation, I started thinking, perhaps for the first time in my life, about who I wanted to be, even though I already had more than five years of extensive experience. I had lived happily as an information hub for those years, working in different positions: financial analyst, budget specialist, advanced data analyst. After graduation, I buckled down and started working as a health systems specialist. The overwhelming part of that work was applying the different methods and tools of cost-saving mechanisms within the health system framework to achieve strategic goals. Understanding and analyzing the six building blocks of the health system, (i) service delivery, (ii) health workforce, (iii) health information systems, (iv) access to essential medicines, (v) financing, and (vi) leadership/governance, demands vast expertise in data management, wrangling, manipulation, and advanced analytics. I always used readily available tools that don't require much programming, such as SPSS, STATA, LiST, MBB, and Microsoft Power BI, or point-and-click tools embedded in the Anaconda distribution such as Glueviz (0.10.4) and Orange3 (3.4.5). Some of these tools are industry-specific and help select high-impact, low-cost interventions; others are used for simulation and modeling of projects that need scale-up. After a quick Google search on trending careers of the future, and after cross-referencing the required skills with my own past experience, I landed naturally on data science. In this post, I will recount the path to my current position as a data scientist and describe some differences between academic research and industry work, so that if you are considering the same options, you might be better informed about the trade-offs.

What is Data Science?

A famous Venn diagram (Google “data science Venn diagram”; shown below) defines data scientists as having skills at the intersection of coding, statistics, and domain expertise.

 


They are the people who take a business problem, go prospecting for available and attainable data, re-formulate the question in technical terms, design and implement a statistical and machine learning task, and re-interpret the results for the business client to ultimately answer the original question. That makes it sound like to be a data scientist you need to be a statistician and a computer scientist with years of industry-specific experience. That’s not quite true.

The reality is, data science is both vast and new, with specializations and sub-fields quickly developing. Highly sought-after data scientists are people who are broadly familiar with all aspects of data science while being experts in one or two fields. It is a highly achievable career for health systems graduate students – with some preparation.

How did I become a Data Scientist?

There are plenty of resources online outlining possible paths to becoming a data scientist; I chose a classroom-based immersive course rather than a boot camp, so I will simply describe my own experience. From the moment I realized I would enjoy being a data scientist, I spent three months in an intensive, full-time data science immersive course given by General Assembly. During those three months I learned intensely. My technical knowledge deficit was overwhelming at times, but here my academic training was an asset: living with overwhelming stress without letting it paralyze you is arguably what “perseverance” is all about. Like any profession, this one has immense jargon and topics, popular and unpopular opinions, and the latest and meanest blog posts, all exchanged electronically in an open environment. But I found local and global instructors who helped me feel at home during the first few weeks of the course.

This immersive classroom-based course was just the beginning of my journey to becoming a data scientist. What I have learned so far is that the qualifications and projects of a data scientist are quite different from those of an academic, and yet the actual work is quite similar in nature. The great majority of a data scientist's time is spent defining and re-defining an ambiguous problem until it can be clearly stated and then solved.

Conclusion

Once a data scientist finds interesting results, it is crucial to communicate them to the end customer or user. Building a story around a complex issue, supporting that story with evidence derived from data, and interpreting the results into a concrete recommendation for the customer or user, are the central tasks of a data scientist. From this perspective, your graduate training in health economics, statistics or operations research will provide a strong foundation for moving into data science.

Good luck with your career transition!

 

Monitoring and Evaluation


Monitoring and Evaluation and its Rationale

Monitoring and evaluation is part of the management functions of an organization. Monitoring and evaluation at different levels feed into each other.

Monitoring: The ongoing collection of information about the activities and operation of a program. This information is used to determine what the program is actually doing and whether activities are being implemented as intended.

Evaluation: The periodic collection of information about the activities, characteristics, and outcomes of programs in order to make judgments, improve effectiveness, and/or identify lessons learned.

Monitoring asks: “What are we doing?”

Evaluation asks: “What have we achieved? What impact have we had?”

Types of Evaluations

  • Formative
    • Strengthen/improve activities
    • Needs assessment
    • Evaluability assessment
    • Structured conceptualization
    • Implementation evaluation
    • Process evaluation
  • Summative
    • Examine the effects/outcomes
    • Outcome evaluation
    • Impact evaluation
    • Cost-effectiveness

Rationale for M&E

  • M&E help to make informed decisions regarding ongoing programs
  • M&E ensure the most effective and efficient use of resources
  • M&E determine where a program is on track and where changes need to be considered
  • M&E help stakeholders conclude whether the program is a success
  • M&E inform subsequent decisions about programs
  • M&E preserve institutional memory

Monitoring and Evaluation Frameworks

One of the main purposes of having M&E frameworks is the clarification of the program’s assumptions, goals, and the known or expected relationships among program and environmental factors that may affect the effectiveness of the activities or the outcome of the intervention. Another important purpose of M&E framework design is to define clear levels of results that should occur as the intervention unfolds. These should be realistic and objective impacts that can be measured and assessed.

M&E frameworks rest on the assumptions and objectives of the program within its operating environment. Drawing on those expectations, the M&E framework provides a schematic design showing how various relevant factors, results, and overall outcomes are linked.

In general, designing frameworks helps to develop:

  • Clearly understood program/project goals and measurable, long-term, short-term, and intermediate objectives
  • Clearly defined relationships between program/project inputs, processes, outputs, and outcomes, and between program/project activities and the external context (environmental factors)

Goals and Objectives

  • Objectives: statements of desired, specific, realistic, and measurable program results

    • SMART
      • Specific: identifies concrete events or actions that will take place
      • Measurable: quantifies the amount of resources, activity, or change to be expended and achieved
      • Appropriate: logically relates to the overall problem statement and desired effects of the program
      • Realistic: Provides a realistic dimension that can be achieved with the available resources and plans for implementation
      • Time-based: specifies a time within which the objective will be achieved

Types of Frameworks

Frameworks can be classified in many ways. In this module, we identify four types of frameworks.

  1. Conceptual Frameworks
  2. Results Frameworks
  3. Logical Frameworks
  4. Logic Models
Conceptual Frameworks (also known as research or theoretical frameworks)

A diagram that identifies and illustrates the relationships between all relevant systemic, organizational, individual, or other salient factors that may influence program/project operation and the successful achievement of program or project goals.

Purpose of a conceptual framework in M&E

  • To show where the program fits into a wider context
  • To clarify assumptions about causal relationships
  • To show how program components will operate to influence outcomes
  • To guide identification of indicators
  • To guide impact analysis (causal pathways)