Usage

This data is available in the openintro package as the data frame email.

Description

These data represent incoming emails for the first three months of 2012 for David Diez’s Gmail Account, early months of 2012. All personally identifiable information has been removed.

Format

A data frame with 3921 observations on the following 21 variables.

Variable Description
spam Indicator for whether the email was spam.
to_multiple Indicator for whether the email was addressed to more than one recipient.
from Whether the message was listed as from anyone (this is usually set by default for regular outgoing email).
cc Indicator for whether anyone was CCed.
sent_email Indicator for whether the sender had been sent an email in the last 30 days.
time Time at which email was sent.
image The number of images attached.
attach The number of attached files.
dollar The number of times a dollar sign or the word “dollar” appeared in the email.
winner Indicates whether “winner” appeared in the email.
inherit The number of times “inherit” (or an extension, such as “inheritance”) appeared in the email.
viagra The number of times “viagra” appeared in the email.
password The number of times “password” appeared in the email.
num_char The number of characters in the email, in thousands.
line_breaks The number of line breaks in the email (does not count text wrapping).
format Indicates whether the email was written using HTML (e.g. may have included bolding or active links).
re_subj Whether the subject started with “Re:”, “RE:”, “re:”, or “rE:”
exclaim_subj Whether there was an exclamation point in the subject.
urgent_subj Whether the word “urgent” was in the email subject.
exclaim_mess The number of exclamation points in the email message.
number Factor variable saying whether there was no number, a small number (under 1 million), or a big number.

This information was copied from ?email on March 20, 2016