hadoop - ClickStream Data Analysis
I am new to big data analysis and came across an interesting scenario called clickstream data analysis. I know what clickstream data is. I would like to know more about it, the different scenarios in which it can be used in the best interests of a business, and the set of tools needed to process the data in the different steps of each scenario.
Any help would be appreciated. Thank you.
What is clickstream data?
It is the virtual trail a user leaves behind while surfing the Internet. A clickstream is a record of a user's activity on the Internet, including every web site and every page of every web site that the user visits, how long the user was on a page or site, in what order the pages were visited, any newsgroups that the user participates in, and the e-mail addresses of mail that the user sends and receives. Both ISPs and individual web sites are capable of tracking a user's clickstream.
Clickstream data may include information such as: browser height and width, browser name, browser language, device type (desktop, laptop, tablet, mobile), revenue, day, timestamp, IP address, URL, number of products added to the cart, number of products removed, state, country, billing zip code, shipping zip code, etc.
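To make that field list concrete, here is a minimal Python sketch of a single hit record. The field names, the subset of fields chosen, and the sample values are assumptions for illustration, not a standard clickstream schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Hit:
    """One clickstream hit. Field names are illustrative assumptions."""
    visitor_id: str
    timestamp: datetime
    url: str
    device_type: str
    browser_name: str
    revenue: float = 0.0
    products_added: int = 0
    products_removed: int = 0


def parse_hit(row: dict) -> Hit:
    """Build a Hit from a dict of strings, e.g. one row from csv.DictReader."""
    return Hit(
        visitor_id=row["visitor_id"],
        timestamp=datetime.fromisoformat(row["timestamp"]),
        url=row["url"],
        device_type=row["device_type"],
        browser_name=row["browser_name"],
        # Missing or empty numeric fields default to zero.
        revenue=float(row.get("revenue") or 0),
        products_added=int(row.get("products_added") or 0),
        products_removed=int(row.get("products_removed") or 0),
    )


# Hypothetical raw row, as it might come out of a CSV export.
sample = parse_hit({
    "visitor_id": "v1",
    "timestamp": "2016-03-04T10:15:00",
    "url": "/products/42",
    "device_type": "mobile",
    "browser_name": "Chrome",
    "revenue": "",
    "products_added": "1",
})
```

Typed records like this make the later aggregation steps easier to reason about, since every hit carries the visitor identifier and a real timestamp rather than raw strings.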
How can I extract more information from clickstream data?
In the web analytics realm, site visitors and potential customers are the equivalent of subjects in a subject-based data set. Consider the following clickstream data example. A subject-based data set is structured in rows and columns (like an Excel spreadsheet): each row of the data set is a unique subject, and each column is a piece of information about that subject. If you want to do customer-based analysis, you need a customer-based data set. In its most granular form, clickstream data looks like the chart below. Hits from the same visitor have been color coded together.
Data scientists can derive more features from clickstream data. For each visitor, we have several hits within a visit, and over an extended period of time we have a collection of visits. We need a way to organize the data at the visitor level. Something like this:
Obviously, there are many different ways to aggregate the data. For numeric data like page views, revenue, and video views, we may want to use an average or a total. By doing this, we get more information about customer behavior. If you observe the aggregated chart, you can tell that the company is making more revenue on Friday.
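The roll-up from hits to visitors can be sketched in plain Python as follows. The sample rows, the tuple layout, and the function names are hypothetical, and the choice of totals and per-visit averages is just one of the many possible aggregations mentioned above.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hit-level rows: (visitor_id, visit_id, ISO timestamp, revenue)
hits = [
    ("v1", "v1-1", "2016-03-04T10:15:00", 0.0),   # Friday
    ("v1", "v1-1", "2016-03-04T10:17:30", 25.0),  # Friday
    ("v1", "v1-2", "2016-03-07T09:00:00", 0.0),   # Monday
    ("v2", "v2-1", "2016-03-04T20:05:00", 40.0),  # Friday
    ("v2", "v2-1", "2016-03-04T20:06:10", 0.0),   # Friday
    ("v3", "v3-1", "2016-03-05T12:00:00", 10.0),  # Saturday
]

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def aggregate_by_visitor(rows):
    """Collapse hit-level rows into one summary dict per visitor."""
    page_views = defaultdict(int)
    visits = defaultdict(set)
    revenue = defaultdict(float)
    for visitor_id, visit_id, _timestamp, amount in rows:
        page_views[visitor_id] += 1       # each hit counts as one page view
        visits[visitor_id].add(visit_id)  # distinct visits per visitor
        revenue[visitor_id] += amount
    return {
        v: {
            "page_views": page_views[v],
            "visits": len(visits[v]),
            "total_revenue": revenue[v],
            "avg_revenue_per_visit": revenue[v] / len(visits[v]),
        }
        for v in page_views
    }


def revenue_by_weekday(rows):
    """Total revenue per day of the week across all visitors."""
    totals = defaultdict(float)
    for _visitor_id, _visit_id, timestamp, amount in rows:
        day = WEEKDAYS[datetime.fromisoformat(timestamp).weekday()]
        totals[day] += amount
    return dict(totals)


summary = aggregate_by_visitor(hits)
by_day = revenue_by_weekday(hits)
best_day = max(by_day, key=by_day.get)
```

With this made-up sample, `summary` holds one row per visitor and `best_day` identifies the weekday with the highest total revenue, which is the kind of observation the aggregated chart supports. On real volumes the same grouping would more likely be expressed in a tool such as Hive, Pig, or Spark than in an in-memory loop.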
Once you have obtained a customer-based data set, there are a number of different statistical models and data science techniques that can give you access to deeper, more meaningful analysis at the visitor level. Data science consulting has expertise and experience in leveraging these methods to:
Predict which customers are at the highest risk of churn and determine the factors that are affecting that risk (allowing you to be proactive in retaining your customer base)
Understand the level of brand awareness of individual customers
Target customers with individualized, relevant offers
Anticipate which customers will convert and statistically determine how your site is influencing that decision
Determine the types of site content that visitors respond to and understand how content engagement drives high-value visits
Define the profiles and characteristics of the different personas of visitors coming to your site, and understand how to engage with them
You may be interested in the following Coursera course:
It's on process mining, which has click trace analysis as a special case, I think.