A deeper analysis of the Enron Scandal (by Charles Bouveyron)

Published 5 years, 11 months ago

Charles Bouveyron is a researcher at Paris Descartes and a co-author of the paper that forms the core of Linkage.

We focus here on the Enron scandal, which led in 2001 to what was then the largest bankruptcy in U.S. history and resulted in more than 4,000 lost jobs. We restrict the analysis to the period from September 1 to December 31, 2001. We chose this specific time window because it is the densest period in terms of emails sent and because it corresponds to a critical period for the company. Indeed, after the announcement in early September 2001 that the company was “in the strongest and best shape that it has ever been in,” the Securities and Exchange Commission (SEC) opened an investigation for fraud on October 31, and the company finally filed for bankruptcy on December 2, 2001.

We consider here the Enron Email data set, which contains all email communications between 149 employees at the head of the company from 1999 to 2002. The original dataset is available here. Only emails exchanged during the period from September 1 to December 31, 2001 are considered, which leaves 20,940 emails sent between the 149 employees. All messages sent between two individuals were merged into a single meta-message. Thus, we end up with a dataset of 1,234 directed edges between employees, each edge carrying the text of all messages between the two people.
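
The preprocessing described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used by Linkage: the toy records, the names `emails` and `build_meta_messages`, and the choice to key each edge by the ordered (sender, recipient) pair are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy records: (sender, recipient, date sent, message body).
emails = [
    ("alice", "bob", date(2001, 9, 3), "gas delivery schedule"),
    ("alice", "bob", date(2001, 10, 15), "updated mmBTU volumes"),
    ("bob", "alice", date(2001, 11, 2), "trading desk report"),
    ("carol", "alice", date(2001, 8, 20), "outside the window"),
]

# Time window studied in this note.
WINDOW_START = date(2001, 9, 1)
WINDOW_END = date(2001, 12, 31)


def build_meta_messages(records, start, end):
    """Merge all in-window messages of each (sender, recipient) pair."""
    texts = defaultdict(list)
    for sender, recipient, sent_on, body in records:
        if start <= sent_on <= end:
            texts[(sender, recipient)].append(body)
    # One directed edge per ordered pair, carrying the concatenated text.
    return {pair: " ".join(bodies) for pair, bodies in texts.items()}


edges = build_meta_messages(emails, WINDOW_START, WINDOW_END)
for (sender, recipient), text in edges.items():
    print(f"{sender} -> {recipient}: {text}")
```

In this toy example, the two messages from "alice" to "bob" collapse into one edge, the reply from "bob" forms a separate edge in the opposite direction, and the August message is dropped because it falls outside the window.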

The data set is available in the "New Job" page, under the "Demonstration data set" panel. Select the "Enron data set" and ask the platform to cluster the network automatically (option "Auto"). The clustering may take up to a few minutes (a progress bar will appear), since Linkage runs the clustering (in parallel on several CPUs) for different numbers of groups of individuals and different numbers of topics.

Once Linkage has finished clustering the data, you can access the result by clicking on the "View" button in the "Jobs" page. Linkage found that the network is made of 8 clusters of persons and 5 topics of discussion. Linkage displays by default what we can call the "meta-network", since it is the network between the 8 groups. The size of the nodes indicates the proportion of persons in each group, the width of the edges is proportional to the probability of connection between the groups, and the color of the edges indicates the majority topic between the groups.
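
To make the three visual encodings of the meta-network concrete, here is a rough Python sketch that derives them from a hard clustering. It is not Linkage's actual inference: the platform presumably reads these quantities from its fitted model, whereas this example uses simple empirical counts. The inputs `clusters` and `edge_topics`, the function `summarize_meta_network`, and the absence of self-loops are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy inputs: a group label per person and a topic per directed edge.
clusters = {"alice": 0, "bob": 0, "carol": 1, "dave": 1}
edge_topics = {
    ("alice", "bob"): 1,
    ("alice", "carol"): 3,
    ("bob", "carol"): 3,
    ("dave", "carol"): 5,
}


def summarize_meta_network(clusters, edge_topics):
    """Return group proportions and per-group-pair edge summaries."""
    sizes = Counter(clusters.values())
    n = len(clusters)
    # Node size: share of persons in each group.
    proportions = {q: size / n for q, size in sizes.items()}

    # Count the topics observed on edges going from group q to group r.
    topic_counts = defaultdict(Counter)
    for (src, dst), topic in edge_topics.items():
        topic_counts[(clusters[src], clusters[dst])][topic] += 1

    meta_edges = {}
    for (q, r), counts in topic_counts.items():
        # Number of possible directed pairs (self-loops are assumed absent).
        possible = sizes[q] * (sizes[r] - 1 if q == r else sizes[r])
        meta_edges[(q, r)] = {
            # Edge width: empirical probability of connection.
            "probability": sum(counts.values()) / possible,
            # Edge color: most frequent topic between the two groups.
            "majority_topic": counts.most_common(1)[0][0],
        }
    return proportions, meta_edges


proportions, meta_edges = summarize_meta_network(clusters, edge_topics)
print(proportions)
print(meta_edges)
```

Counting observed edges over possible pairs is the simplest estimate of a connection probability between two groups; a model-based tool would typically report a fitted parameter instead.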

Before going further in the analysis, it is necessary to have a look at the topics that were found. A summary of the topics is available in the side bar, which displays the most specific words of each topic. Notice that it is also possible to edit the topic titles to personalize the analysis. A global view of the topics is also available in the "Statistics" panel.

The topics can be easily interpreted by looking at the most specific words of each topic, displayed in the above figure. In a few words, we can summarize them as follows:
- Topic 1 refers to technical discussions on gas deliveries (mmBTU stands for one million British thermal units, one BTU being equal to about 1,055 joules),
- Topic 2 contains elements related to the California electricity crisis, in which Enron was involved, and which almost caused the bankruptcy of SCE-corp (Southern California Edison Corporation) in early 2001,
- Topic 3 is concerned with Enron activities in Afghanistan (Enron and the Bush administration were suspected of working secretly with the Taliban up to a few weeks before the 9/11 attacks),
- Topic 4 is about usual logistic issues (building equipment, computers, ...),
- Topic 5 seems to refer to the financial and trading activities of Enron.

At this point, it is possible to go back to the meta-network or to the unfolded version of the network (click on the "Expand clusters" button at the bottom) to see which individuals were interested in the "Afghanistan" issue, for instance. It turns out that the persons involved in this issue are mostly in the purple group (cluster 4).

This short note has shown how the Linkage platform can be used to analyze, from a novel point of view, one of the most studied communication networks, automatically highlighting the main elements of the Enron scandal.