<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Dr. Jason Davis</title>
	<atom:link href="http://drjasondavis.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://drjasondavis.com</link>
	<description>Data Science Hacking</description>
	<lastBuildDate>Sun, 19 May 2013 16:05:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='drjasondavis.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/095dd04a59c8f03bd14460b595c587b2?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Dr. Jason Davis</title>
		<link>http://drjasondavis.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://drjasondavis.com/osd.xml" title="Dr. Jason Davis" />
	<atom:link rel='hub' href='http://drjasondavis.com/?pushpress=hub'/>
		<item>
		<title>A Corollary to ExperimentCalculator.com (with examples)</title>
		<link>http://drjasondavis.com/2013/05/19/a-corollary-to-experimentcalculator-com-with-examples/</link>
		<comments>http://drjasondavis.com/2013/05/19/a-corollary-to-experimentcalculator-com-with-examples/#comments</comments>
		<pubDate>Sun, 19 May 2013 16:05:02 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.com/?p=127</guid>
		<description><![CDATA[Dan McKinley recently put together a very useful tool in estimating how long to run your A/B tests. The obvious corollary here being, “your experiments will take much longer than you think”. Let’s dive into some real-world numbers. Adwords campaign &#8230; <a href="http://drjasondavis.com/2013/05/19/a-corollary-to-experimentcalculator-com-with-examples/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=drjasondavis.com&#038;blog=18274169&#038;post=127&#038;subd=drjasondavis&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://drjasondavis.files.wordpress.com/2013/05/34.jpg"><img class="alignnone size-full wp-image-134" alt="34" src="http://drjasondavis.files.wordpress.com/2013/05/34.jpg?w=440"   /></a></p>
<p>Dan McKinley recently put together <a href="http://www.experimentcalculator.com/">a very useful tool for estimating how long to run your A/B tests</a>.</p>
<p>The obvious corollary: “your experiments will take much longer than you think”.</p>
<p>Let’s dive into some real-world numbers.</p>
<p><strong>Adwords campaign optimization</strong></p>
<p>The scenario. You’re buying clicks from Google Adwords to get people to sign up for your startup’s new service. You just made some copy changes to the landing page which you’re hoping will improve signup conversion. Your base signup rate is 10%, and you expect your new changes to increase signup rate to 15% (a +50% increase!). You spend $0.50 per click with a budget of $100 per day, so your landing pages see a total of 200 visits each day.</p>
<p>The statistics. You’d have to run this campaign for <a href="http://www.experimentcalculator.com/#lift=50&amp;conversion=10&amp;visits=200&amp;percentage=50">8 days</a> and spend $800 to verify the changes. Alternatively, if the conversion rate increased to only 11% (a +10% change), you’d have to spend $15,000 to verify the change.</p>
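<p>For a rough sanity check of these numbers, here is a minimal sketch using the textbook two-proportion sample size formula. The 5% significance level, 80% power, and the function names are assumptions of this sketch, not necessarily what ExperimentCalculator.com uses, so it will not reproduce the calculator exactly.</p>

```python
import math
from statistics import NormalDist


def visits_per_arm(base_rate, relative_lift, alpha=0.05, power=0.80):
    """Visits needed in each arm to detect the lift with a two-sided z-test."""
    p1 = base_rate
    p2 = base_rate * (1.0 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # significance threshold
    z_power = NormalDist().inv_cdf(power)              # desired power
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def cost_to_verify(base_rate, relative_lift, cost_per_click):
    """Click spend for a 50/50 split: both arms need the full sample."""
    return 2 * visits_per_arm(base_rate, relative_lift) * cost_per_click


if __name__ == "__main__":
    # Assumed inputs from the scenario: 10% base rate, $0.50 per click.
    for lift in (0.50, 0.10):
        n = visits_per_arm(0.10, lift)
        dollars = cost_to_verify(0.10, lift, cost_per_click=0.50)
        print(f"lift {lift:.0%}: {n} visits per arm, about ${dollars:,.0f}")
```

<p>Under these assumptions the sketch lands in the same ballpark as the figures above: several hundred dollars of clicks for the 10% to 15% case, and roughly $15,000 for the 10% to 11% case.</p>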
<p><strong>Ecommerce optimization (Etsy)</strong></p>
<p>The scenario. During a company hack week, a designer makes several changes to the cart page and wants to run a 1% experiment. The designer is quite bullish about the changes and thinks that it could in fact boost sales by 5% (!), or about $50 million from 2013’s expected sales of over $1 billion.</p>
<p><a href="http://drjasondavis.files.wordpress.com/2013/05/etsy_cart.png"><img class="alignnone size-full wp-image-133" alt="etsy_cart" src="http://drjasondavis.files.wordpress.com/2013/05/etsy_cart.png?w=440&#038;h=194" width="440" height="194" /></a></p>
<p>The statistics. According to their <a href="https://www.etsy.com/blog/news/2013/etsy-statistics-april-2013-weather-report/">blog</a>, Etsy sold over $100 million worth of goods in April with almost 1.5 billion page views. Assuming standard e-commerce conversion rates of 4% (along with some other assumptions about average order size), this experiment would need to be run for over <a href="http://www.experimentcalculator.com/#lift=5&amp;conversion=4&amp;visits=567924&amp;percentage=1">3 years</a>! An experiment affecting 10% of users would require <em>only</em> two weeks.</p>
<p><b>My last startup (Adtuitive)</b></p>
<p>The scenario. We bought relatively cheap display ads on niche content sites and matched sku-level ads from our database of millions of products. Depending on placement and sites, click rates for us were sometimes around 0.1% (which believe it or not was a huge improvement over static banner ads). We were serving around 200 million ads a month, and we were releasing an algorithmic change that we thought might increase click rates (and our revenue) by 10% (!).</p>
<p><a href="http://drjasondavis.files.wordpress.com/2013/05/adtuitive_ad_example.png"><img class="alignnone size-full wp-image-132" alt="adtuitive_ad_example" src="http://drjasondavis.files.wordpress.com/2013/05/adtuitive_ad_example.png?w=440&#038;h=279" width="440" height="279" /></a></p>
<p>As the change was somewhat major, we didn’t want to roll it out to more than 10% of visits during our experiment.</p>
<p>The statistics. We would have had to run the experiment for <a href="http://www.experimentcalculator.com/#lift=10&amp;conversion=0.1&amp;visits=500000&amp;percentage=20&amp;power=80">39 days</a>. Our 200 million ads per month equated to 3 million per day, or about 500k visits per day (visitors view multiple ads). Running it at 50% would have required only 7 days.</p>
<p><strong>Takeaways</strong></p>
<p>Calling bullshit. Next time someone claims they increased their landing page conversion from 10% to 15%, you may want to question things. Exactly how many conversions are they dealing with? And how many separate changes did they make? Small changes are also harder to measure than larger ones.</p>
<p>Google’s famed 1% experiments really only work at Google scale. You’ll have to run your experiments at 10% or 50% levels. And of course, make sure you double check your statistics.</p>
<p>Opportunity cost. Experiments take more than just design and software to code up; they also take time to run and verify. So before restyling the checkout button, ask yourself if there are other parts of your core funnels or product that you’d be better off testing first.</p>
<p>Other reasons to test. Sometimes changes are necessary to accommodate future functionality or new strategic changes for the overall product. E.g. restyling the cart page to provide more whitespace for a future gift cards launch, or revamping the homepage to give attention to some fledgling social aspects of your site. In these cases, even when you expect a 0% change (or even a negative change), testing is still important to understand impact. And of course, statistics still apply.</p>
<p>So, next time you&#8217;re planning to run an experiment, you may want to spend some time with <a href="http://experimentcalculator.com">Mr. ExperimentCalculator.com</a> first. Your intuition is most likely wrong.</p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2013/05/19/a-corollary-to-experimentcalculator-com-with-examples/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2013/05/34.jpg" medium="image">
			<media:title type="html">34</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2013/05/etsy_cart.png" medium="image">
			<media:title type="html">etsy_cart</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2013/05/adtuitive_ad_example.png" medium="image">
			<media:title type="html">adtuitive_ad_example</media:title>
		</media:content>
	</item>
		<item>
		<title>Restart</title>
		<link>http://drjasondavis.com/2012/12/09/restart/</link>
		<comments>http://drjasondavis.com/2012/12/09/restart/#comments</comments>
		<pubDate>Sun, 09 Dec 2012 14:34:16 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.com/?p=118</guid>
		<description><![CDATA[Etsy acquired my startup Adtuitive in 2009. At the time, we had a pretty cool product that automated online advertising for small retailers, and we were operating at a modest scale of 200 million ads a month. Deciding to sell &#8230; <a href="http://drjasondavis.com/2012/12/09/restart/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=drjasondavis.com&#038;blog=18274169&#038;post=118&#038;subd=drjasondavis&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.etsy.com/listing/115943403/having-a-bad-hair-day-vintage-pink-gun"><img class="alignnone size-full wp-image-119" alt="gun hairdryer" src="http://drjasondavis.files.wordpress.com/2012/12/gun-hairdryer.jpg?w=440&#038;h=440" height="440" width="440" /></a></p>
<p>Etsy acquired my startup Adtuitive in 2009. At the time, we had a pretty cool product that automated online advertising for small retailers, and we were operating at a modest scale of 200 million ads a month.</p>
<p>Deciding to sell the company was tough, but the last three years at Etsy were awesome. I had the privilege of working with very talented folks across a full stack of things, from Hadoop infrastructure to search ranking to search UI. And of course, search ads. During this time period, I saw Etsy grow from <a href="http://www.ecommercebytes.com/cab/cab/abn/y10/m01/i15/s01">$180 million in 2009</a> sales, to over <a href="http://www.etsy.com/blog/news/2012/etsy-statistics-october-2012-weather-report/">$80 million last October</a> alone. My team grew from Adtuitive’s engineering team of only 4 to almost 30 in total.</p>
<p>I’ll miss my time at Etsy, but I’m an entrepreneur at heart, and it’s time to start over. I’ll be taking some time off in the upcoming months &#8211; time off from work, time off from management, time off from NYC life. I look forward to writing code again and thinking about real world problems at a fundamental and disruptive level. I’ll be back up and running at 110% sometime early to mid next year.</p>
<p>Stay or get back in touch with me at jvdavis ‘at’ gmail.com.</p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/12/09/restart/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/12/gun-hairdryer.jpg" medium="image">
			<media:title type="html">gun hairdryer</media:title>
		</media:content>
	</item>
		<item>
		<title>NYC Dining: The cost of a &#8220;B&#8221;</title>
		<link>http://drjasondavis.com/2012/08/26/nyc-dining-the-cost-of-a-b/</link>
		<comments>http://drjasondavis.com/2012/08/26/nyc-dining-the-cost-of-a-b/#comments</comments>
		<pubDate>Sun, 26 Aug 2012 22:21:04 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.com/?p=92</guid>
		<description><![CDATA[If you eat out in New York City, the image above should evoke some sort of visceral reaction. In July of 2010, the NYC Department of Health began rating each of the  24,000 restaurants throughout the five boroughs of the &#8230; <a href="http://drjasondavis.com/2012/08/26/nyc-dining-the-cost-of-a-b/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=drjasondavis.com&#038;blog=18274169&#038;post=92&#038;subd=drjasondavis&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><img class="alignnone" title="NYC Inspection Rating - B" src="http://nycprgirls.com/wp-content/uploads/2011/01/sanitary-inspections.jpg" alt="" width="420" height="316" /></p>
<p>If you eat out in New York City, the image above should evoke some sort of visceral reaction. In July of 2010, the NYC Department of Health began rating each of the 24,000 restaurants throughout the five boroughs of the city. Each restaurant is given a grade of &#8220;A&#8221;, &#8220;B&#8221;, or &#8220;C&#8221; based on violations ranging from improper food temperature to sewage problems to the presence of vermin. You can browse the complete list <a href="http://www.nyc.gov/html/doh/downloads/pdf/rii/self-inspection-worksheet.pdf">here</a>.</p>
<p>Fast forward 2 years, and the new system seems to be a win for consumers &#8211; <a href="http://www.nyc.gov/portal/site/nycgov/menuitem.c0935b9a57bb4ef3daf2f1c701c789a0/index.jsp?pageID=mayor_press_release&amp;catID=1194&amp;doc_name=http%3A%2F%2Fwww.nyc.gov%2Fhtml%2Fom%2Fhtml%2F2012a%2Fpr076-12.html&amp;cc=unused1978&amp;rc=1194&amp;ndi=1">Mayor Bloomberg credits the program</a> with a 14% reduction in Salmonella, the lowest rate in 20 years. And according to this press release, NYC restaurant revenue is also up 9.3% since grading began. Still, many <a href="http://cityroom.blogs.nytimes.com/2012/03/07/restaurateurs-voice-anger-over-health-inspections/">restaurateurs disagree</a>, expressing anger over these health inspections. Restaurants complain about the complexity of the grading system, fighting with the city over infraction points, and spending additional money to maintain their facilities to meet the city&#8217;s guidelines.</p>
<p>But clearly the biggest cost associated with the city&#8217;s program is the fear of a &#8220;B&#8221; rating, or, even worse, an unmentionable &#8220;C&#8221; rating.</p>
<h2>Just how costly is a &#8220;B&#8221;?</h2>
<p>To quantify these costs, I correlated NYC restaurant inspection rating changes with their restaurant ratings on the popular review site, <a href="http://www.yelp.com">Yelp</a>. Starting with the most popular 1000 restaurants in Manhattan on Yelp, I crawled each of their review pages and extracted ratings for each restaurant. NYC health inspection ratings are available via <a href="https://nycopendata.socrata.com/">NYC&#8217;s OpenData initiative</a>, and each of these top Yelp restaurants was then matched with its corresponding health code ratings. All code is available on GitHub under my <a href="https://github.com/drjasondavis/Nyc-Restaurant-Inspection-Analysis">Nyc Restaurant Inspection Project</a>, along with a <a href="https://raw.github.com/drjasondavis/Nyc-Restaurant-Inspection-Analysis/master/data/reviews_and_inspections.csv">csv</a> that contains joined Yelp restaurant reviews with their corresponding inspection ratings.</p>
<p>According to the Mayor&#8217;s argument, Salmonella cases have gone down because restaurant inspection ratings have, on average, improved since the start of the program. The Mayor&#8217;s report claims that the number of &#8220;A&#8221; ratings has increased from 65% to 72% of all restaurants since the start of the program. And within the set of top Manhattan restaurants analyzed here, trends are similar. The plot below shows the average inspection rating since July 2010 (5.0 represents &#8220;A&#8221;, 4.0 &#8220;B&#8221;, etc):</p>
<p><a href="http://drjasondavis.files.wordpress.com/2012/08/inspection_ratings_avg_over_time.png"><img class="alignnone size-full wp-image-96" title="inspection_ratings_avg_over_time" src="http://drjasondavis.files.wordpress.com/2012/08/inspection_ratings_avg_over_time.png?w=440&#038;h=330" alt="" width="440" height="330" /></a></p>
<p>Looking at average Yelp reviews since 2005, we can see that the time period since August 2010 is relatively stable, hovering between 3.8 and 3.9.</p>
<p><a href="http://drjasondavis.files.wordpress.com/2012/08/yelp_ratings_avg_over_time.png"><img class="alignnone size-full wp-image-95" title="yelp_ratings_avg_over_time" src="http://drjasondavis.files.wordpress.com/2012/08/yelp_ratings_avg_over_time.png?w=440&#038;h=330" alt="" width="440" height="330" /></a></p>
<p>To get a better sense of how ratings are impacted by inspection grades, let&#8217;s look at restaurant grade changes (&#8220;A&#8221; to &#8220;B&#8221;, &#8220;A&#8221; to &#8220;C&#8221;) and see how their Yelp ratings changed between the 60 days before and the 60 days after:</p>
<table>
<tbody>
<tr>
<th><strong>Change   </strong></th>
<th><strong>Rating Before   </strong></th>
<th><strong>Rating After   </strong></th>
<th><strong>Delta  </strong></th>
</tr>
<tr>
<td>A -&gt; C</td>
<td>3.94</td>
<td>3.68</td>
<td>-6.7%</td>
</tr>
<tr>
<td>B -&gt; C</td>
<td>3.86</td>
<td>3.69</td>
<td>-4.6%</td>
</tr>
<tr>
<td>A -&gt; B</td>
<td>3.77</td>
<td>3.76</td>
<td>-0.3%</td>
</tr>
</tbody>
</table>
<p>Restaurants downgraded to a &#8220;C&#8221; rating received significantly lower Yelp ratings in the 60 days after the downgrade, but restaurants receiving a &#8220;B&#8221; rating were relatively unaffected in their review quality.</p>
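<p>The before/after comparison can be sketched roughly as follows. This is an illustration rather than the code from the GitHub project: the function name, the input shape, and the sample reviews are all made up here, and how reviews were pooled across restaurants is not something this sketch tries to reproduce.</p>

```python
from datetime import date
from statistics import mean


def before_after(reviews, change_date, window_days=60):
    """Compare Yelp stars in the windows before and after a grade change.

    reviews: iterable of (review_date, stars) pairs for one restaurant.
    Returns (mean_before, mean_after, percent_delta), or None if either
    window has no reviews.
    """
    before, after = [], []
    for review_date, stars in reviews:
        offset = (review_date - change_date).days
        if offset in range(-window_days, 0):
            before.append(stars)
        elif offset in range(0, window_days):
            after.append(stars)
    if not before or not after:
        return None
    mean_before, mean_after = mean(before), mean(after)
    delta = 100.0 * (mean_after - mean_before) / mean_before
    return mean_before, mean_after, delta


# Made-up reviews for one hypothetical restaurant downgraded on 2012-03-01.
sample = [
    (date(2012, 1, 20), 4),
    (date(2012, 2, 10), 4),
    (date(2012, 3, 5), 3),
    (date(2012, 4, 2), 4),
    (date(2011, 6, 1), 1),  # outside both windows, ignored
]
print(before_after(sample, date(2012, 3, 1)))
```

<p>The delta is expressed relative to the &#8220;before&#8221; average, which is how the percentages in the table read.</p>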
<p>So restaurants with &#8220;C&#8221; ratings tend to have a lower review quality on Yelp, but do lower ratings deter people from dining at a restaurant in the first place? Looking at overall review counts for 60 day periods before and after rating changes:</p>
<table>
<tbody>
<tr>
<th><strong>Change   </strong></th>
<th><strong>Count Before   </strong></th>
<th><strong>Count After   </strong></th>
<th><strong>Delta  </strong></th>
</tr>
<tr>
<td>A -&gt; C</td>
<td>167</td>
<td>157</td>
<td>-6.0%</td>
</tr>
<tr>
<td>B -&gt; C</td>
<td>214</td>
<td>230</td>
<td>+7.5%</td>
</tr>
<tr>
<td>A -&gt; B</td>
<td>724</td>
<td>699</td>
<td>-3.5%</td>
</tr>
</tbody>
</table>
<p>The increase in review counts in &#8220;B&#8221; to &#8220;C&#8221; downgrades is most likely due to the data being somewhat thin. Across all three downgrades, Yelp ratings as well as review counts showed average decreases of almost 2%.</p>
<h2>Takeaways</h2>
<p>A <a href="http://www.hbs.edu/research/pdf/12-016.pdf">recent study by Michael Luca</a> found that increased Yelp review rating quality can lead to increased revenue. Among other things, the study also found that independently owned restaurants were much more affected by these reviews than ones with chain affiliations. Many of Manhattan&#8217;s top restaurants analyzed here are independent, and the decrease in Yelp ratings here undoubtedly also corresponds to lost revenue.</p>
<p>An interesting question to consider is one of causation: one goal of the inspection program is to improve sanitary conditions at restaurants in NYC. When a restaurant transitions from an &#8220;A&#8221; rating to a &#8220;C&#8221; rating, the only change we can say for certain is the letter grade posted outside the front door. In the days and weeks following a downgrade, one would expect restaurants to actually clean up their sanitary conditions. So, during the time period analyzed here, sanitary conditions before the downgrade are probably worse than after.</p>
<p>Of course, the other goal of NYC&#8217;s inspection process is to increase consumer awareness. And consumers seem to notice: when restaurants are downgraded, the costs are measurable.</p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/08/26/nyc-dining-the-cost-of-a-b/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>

		<media:content url="http://nycprgirls.com/wp-content/uploads/2011/01/sanitary-inspections.jpg" medium="image">
			<media:title type="html">NYC Inspection Rating - B</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/08/inspection_ratings_avg_over_time.png" medium="image">
			<media:title type="html">inspection_ratings_avg_over_time</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/08/yelp_ratings_avg_over_time.png" medium="image">
			<media:title type="html">yelp_ratings_avg_over_time</media:title>
		</media:content>
	</item>
		<item>
		<title>An Insider&#8217;s Guide to Facebook&#8217;s IPO</title>
		<link>http://drjasondavis.com/2012/05/13/an-insiders-guide-to-facebooks-ipo/</link>
		<comments>http://drjasondavis.com/2012/05/13/an-insiders-guide-to-facebooks-ipo/#comments</comments>
		<pubDate>Sun, 13 May 2012 13:35:41 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.com/?p=86</guid>
		<description><![CDATA[The Wall Street Journal recently had a piece on investing in the Facebook IPO. They admit, “most retail investors will be shut out of the offering and won&#8217;t get the IPO price, meaning they likely will have to pay more &#8230; <a href="http://drjasondavis.com/2012/05/13/an-insiders-guide-to-facebooks-ipo/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=drjasondavis.com&#038;blog=18274169&#038;post=86&#038;subd=drjasondavis&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><img class="alignnone" title="Insiders" src="http://www.bubbleinfo.com/wp-content/uploads/2012/02/insider_trading.jpg" alt="" width="350" height="285" /></p>
<p>The Wall Street Journal recently had a piece on <a href="http://online.wsj.com/article_email/SB10001424052702304543904577394113850246518-lMyQjAxMTAyMDEwMTExNDEyWj.html">investing in the Facebook IPO</a>. They admit, “most retail investors will be shut out of the offering and won&#8217;t get the IPO price, meaning they likely will have to pay more in the days that follow if they want an early piece of the action”.</p>
<p>To see what’s going on here, let’s take a closer look at what happens on the day of the IPO. In pricing the IPO, there are two prices to consider: the offer price, and the open price. The offer price is set by the company and underwriters. This is the stock price that the company receives in its IPO sale. The open price is set by the publicly traded market on the day the company goes public.</p>
<p>As an example, Linkedin went public in May of 2011. Their offering price was $45 per share, and the stock’s open price was $83 per share. On opening day, the stock went up to over $120 per share, and had a low of $80 per share. Linkedin sold approximately 7.8 million shares of stock in its IPO, so on opening day the value of these shares ranged from $624 million (at $80 per share) to $957 million ($122.70 per share). Had Linkedin and their underwriters set the offer price to closer to $80 per share, Linkedin could have made an additional $273 million during its IPO sale.</p>
<p>Let’s consider the difference in offer price vs open price among the six tech companies mentioned in the WSJ article:</p>
<p><span style="font-family:monospace;"><br />
<strong>Company     Offer   Open (low)  Raised    Loss</strong><br />
Linkedin    $45     $80.00      $351M     $273M<br />
Groupon     $20     $26.11      $700M     $214M<br />
Yandex      $25     $30.55      $1.4B     $310M<br />
Zynga       $10     $9.00       $1B       ($100M)<br />
Renren      $14     $12.30      $743M     ($90M)<br />
Pandora     $16     $17.35      $235M     $20M<br />
<strong>TOTAL                           $4.4B     $627M</strong></span></p>
<p>Had these six companies and their underwriters set the offer price closer to the open price, they could have raised an additional $627 million. This value was instead realized by institutional investors and select individuals who were able to participate in the IPO at the offering price. They bought a total of $4.43 billion in stock at the IPO, and this stock immediately increased to $5.06 billion when the stock traded publicly that very same day.</p>
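<p>The arithmetic behind the table can be checked with a short sketch. One assumption here: the number of shares sold is not listed in the table, so it is inferred as dollars raised divided by offer price (for Linkedin this gives the 7.8 million shares mentioned earlier). The names in the code are illustrative.</p>

```python
# (company, offer price, open price used in the table, dollars raised)
IPOS = [
    ("Linkedin", 45.00, 80.00, 351e6),
    ("Groupon", 20.00, 26.11, 700e6),
    ("Yandex", 25.00, 30.55, 1.4e9),
    ("Zynga", 10.00, 9.00, 1.0e9),
    ("Renren", 14.00, 12.30, 743e6),
    ("Pandora", 16.00, 17.35, 235e6),
]


def left_on_table(offer, open_price, raised):
    """Extra proceeds had the same shares been sold at the open price."""
    shares_sold = raised / offer  # assumption: inferred, not from filings
    return (open_price - offer) * shares_sold


total_raised = sum(raised for _, _, _, raised in IPOS)
total_loss = sum(left_on_table(offer, open_price, raised)
                 for _, offer, open_price, raised in IPOS)

for name, offer, open_price, raised in IPOS:
    loss = left_on_table(offer, open_price, raised)
    print(f"{name:10s} {loss / 1e6:8.0f}M")
print(f"{'TOTAL':10s} {total_loss / 1e6:8.0f}M on {total_raised / 1e9:.2f}B raised")
```

<p>Negative values correspond to the parenthesized entries in the table, where the stock opened below the offer price.</p>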
<p>Unfortunately, as the WSJ says, most retail investors have no access to IPOs at the offer price. As for the company, they have no access to IPOs at the open price. So who’s really getting rich off of these IPOs? The underwriters and their insiders.</p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/05/13/an-insiders-guide-to-facebooks-ipo/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>

		<media:content url="http://www.bubbleinfo.com/wp-content/uploads/2012/02/insider_trading.jpg" medium="image">
			<media:title type="html">Insiders</media:title>
		</media:content>
	</item>
		<item>
		<title>3 Day Startup NYC: Day 1 Mentoring</title>
		<link>http://drjasondavis.com/2012/04/21/3-day-startup-nyc-day-1-mentoring/</link>
		<comments>http://drjasondavis.com/2012/04/21/3-day-startup-nyc-day-1-mentoring/#comments</comments>
		<pubDate>Sat, 21 Apr 2012 15:08:51 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.com/?p=68</guid>
		<description><![CDATA[I love innovation. I love working with smart people. And I love working with limited resources around extremely tight deadlines. 3 Day Startup: 30 bright entrepreneurs. Ideas flow Friday night, business plan and product demo need to happen by Sunday &#8230; <a href="http://drjasondavis.com/2012/04/21/3-day-startup-nyc-day-1-mentoring/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=drjasondavis.com&#038;blog=18274169&#038;post=68&#038;subd=drjasondavis&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://drjasondavis.files.wordpress.com/2012/04/3ds.jpeg"><img class="alignnone size-full wp-image-69" title="3ds" src="http://drjasondavis.files.wordpress.com/2012/04/3ds.jpeg?w=440&#038;h=247" alt="" width="440" height="247" /></a></p>
<p>I love innovation. I love working with smart people. And I love working with limited resources around extremely tight deadlines.</p>
<p>3 Day Startup: 30 bright entrepreneurs. Ideas flow Friday night, business plan and product demo need to happen by Sunday night.</p>
<p>The program started out of UT Austin, and I attended as a participant at their first event. I actually pitched the idea that the group ended up building that weekend, which ended up spinning out into a company now called <a href="http://moodfish.com/">Moodfish</a>. Nik has since taken Moodfish 1000x further, well beyond the simple idea I had. 3DS has grown tremendously since and is now a worldwide operation. They&#8217;ve held events in Germany, Spain, The Netherlands, France, Portugal, Israel, Chile, and China.</p>
<p>This weekend marks their first ever event in New York City. I had the privilege of working with a very talented team of individuals last night as a mentor. And Etsy also graciously <a href="http://nyc.3daystartup.org/sponsors/">sponsored the event</a>.</p>
<p>As a mentor, I really enjoyed the discussions I had with these young entrepreneurs and technologists. Some common themes / feedback I gave:</p>
<ul>
<li>For inexperienced / first time entrepreneurs, ideas motivated by a problem that they&#8217;ve experienced first hand are always the best. Once this problem has been identified, the next question to answer is whether building a product can solve this problem, and if that product can support a business.</li>
<li>Undervaluing current standards. Email, SMS, Craigslist, Post-it Notes: these are all established standards, and their simplicity and pervasiveness are what make them awesome. When talking innovation, it&#8217;s easy to get excited; remember these standards.</li>
<li>Maintaining scope and focus. The minimum viable startup (or product) is critical on so many levels: conveying the true value / pitching the idea to others, understanding the business, maintaining focus on execution, and minimizing your technological needs. Every additional component to the business (&#8220;and, we&#8217;ll give 3% to charity&#8221;) adds a tremendous amount of complexity to the business. Start simple.</li>
</ul>
<p>As a sponsor, I find 3DS a great way to meet people in a realistic and high-stress setting. For a 3DS participant, a successful weekend requires many real-world skills:</p>
<ul>
<li>The ability to collaborate with people from totally different backgrounds. E.g. an engineer having a discussion with a marketing person about developing a landing page.</li>
<li>For engineers, comfort with a workbench, and the ability to rapidly prototype.</li>
<li>Execution. Having something to show by the end of the weekend. Finding a balance between working hard, working fast, and building simple and iterating.</li>
</ul>
<p>Final presentations are on Sunday. I&#8217;ll be serving on the panel. Get tickets for the <a href="http://www.eventbrite.com/event/2836898239">final presentations on EventBrite</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/04/21/3-day-startup-nyc-day-1-mentoring/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/04/3ds.jpeg" medium="image">
			<media:title type="html">3ds</media:title>
		</media:content>
	</item>
		<item>
		<title>Lending Club Loan Analysis: Making Money with Logistic Regression</title>
		<link>http://drjasondavis.com/2012/04/08/lending-club-loan-analysis-making-money-with-logistic-regression/</link>
		<comments>http://drjasondavis.com/2012/04/08/lending-club-loan-analysis-making-money-with-logistic-regression/#comments</comments>
		<pubDate>Sun, 08 Apr 2012 15:57:44 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.com/?p=49</guid>
		<description><![CDATA[The Lending Club is an online marketplace for loans. As a borrower, you can apply for a loan, and if accepted, your loan gets listed in the marketplace. As an investor, you can browse loans in the marketplace, and invest &#8230; <a href="http://drjasondavis.com/2012/04/08/lending-club-loan-analysis-making-money-with-logistic-regression/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><a href="http://drjasondavis.files.wordpress.com/2012/04/lc_blogpost_image.jpg"><img class="alignnone size-full wp-image-52" title="lc_blogpost_image" alt="" src="http://drjasondavis.files.wordpress.com/2012/04/lc_blogpost_image.jpg?w=440"   /></a></p>
<p>The <a href="http://lendingclub.com/">Lending Club</a> is an online marketplace for loans. As a borrower, you can apply for a loan, and if accepted, your loan gets listed in the marketplace. As an investor, you can browse loans in the marketplace, and invest in individual loans at your discretion. This peer-to-peer model has many advantages over traditional banking counterparts, such as lower overhead costs and a lower cost of capital.</p>
<p>But what excites me the most about peer to peer lending is the democratization of data. As an investor, you can see each and every rejected, completed, ongoing, and available loan. While loan data excludes personally identifiable information, it does include attributes like credit rating, location, college education level, lines of credit, and descriptions of why the applicant needs the loan.</p>
<p>For your average investor who doesn’t have the sophistication (or time) to sift through tens of thousands of loans, the Lending Club provides tools to find loans based on one’s risk and diversification goals. Being a data geek, I of course immediately downloaded the full dataset.</p>
<p>One of the first things I noticed was that many loans have fairly long descriptions:</p>
<blockquote><p>&#8220;Dear Lenders, I was involved in a sports injury approximately 18 months ago…..Thank you for taking time to read this letter.  Thank you&#8221;</p></blockquote>
<p>While this borrower is clearly in an unfortunate situation (the full text was over 1500 characters in length), it appears as if borrowers who write longer descriptions actually have much higher default rates:</p>
<p><a href="http://drjasondavis.files.wordpress.com/2012/04/description_status.png"><img class="alignnone size-medium wp-image-50" title="description_status" alt="" src="http://drjasondavis.files.wordpress.com/2012/04/description_status.png?w=300&#038;h=228" width="300" height="228" /></a></p>
<p>So, is it possible to aggregate across several attributes with the goal of improving upon Lending Club&#8217;s basic investment strategies?</p>
<p>The basic problem to be solved here is one of predicting loan default rate. Given a loan with an interest rate of 12% and another loan with an interest rate of 16%, the expected default rate of each loan tells me my expected return: roughly, the interest rate multiplied by the probability that the loan does not default. For example, if the first loan has an expected default rate of 25%, and the second a rate of 50%, then my expected interest rates from the two loans would be 9% and 8%, respectively. I’d be better off investing in the first loan.</p>
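<p>The arithmetic in that example can be written out as a tiny sketch. The function name is made up for illustration, and the model is the simplified one used in this post: a loan that defaults is assumed to pay no interest, and loss of principal is ignored.</p>

```python
def expected_interest_rate(interest_rate, default_rate):
    """Expected rate under the simplified model from the post: a loan that
    defaults is assumed to pay no interest; loss of principal is ignored."""
    return interest_rate * (1.0 - default_rate)


loan_a = expected_interest_rate(0.12, 0.25)  # 12% loan, 25% expected default rate
loan_b = expected_interest_rate(0.16, 0.50)  # 16% loan, 50% expected default rate

print(round(loan_a, 4), round(loan_b, 4))  # 0.09 0.08
```

<p>The loan with the lower headline rate wins once default risk is priced in, which is the whole motivation for predicting default rates well.</p>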
<p>The Lending Club’s analysis tools model default risk solely as a function of a single attribute, credit grade. I built a logistic regression model that optimizes over twelve different attributes including loan size, interest rate, application date, debt to income ratio, home ownership status, and description length.</p>
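<p>For readers who haven't used logistic regression, here is a minimal sketch of the idea, not the code from my analysis. It fits a default-probability model by plain gradient descent using only the standard library. The two features, their scaling, and all of the data points are made up for illustration.</p>

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, steps=5000, learning_rate=0.5):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += error
            for k, xk in enumerate(x):
                grad_w[k] += error * xk
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def predict_default(weights, bias, x):
    """Predicted probability that a loan with features x defaults."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical, pre-scaled features: [debt_to_income, description_length]
rows = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.9], [0.8, 0.2],
        [0.7, 0.8], [0.9, 1.0], [0.2, 0.7], [0.6, 0.1]]
labels = [0, 0, 1, 1, 1, 1, 0, 0]  # 1 = defaulted (made-up outcomes)

weights, bias = fit_logistic(rows, labels)
risky = predict_default(weights, bias, [0.9, 0.9])
safe = predict_default(weights, bias, [0.1, 0.1])
print(f"risky={risky:.2f} safe={safe:.2f}")
```

<p>The real model has twelve attributes rather than two, but the output is the same kind of object: a predicted default probability per loan.</p>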
<p>The model was trained on the earliest 50% of loans issued and evaluated on the other half. For each loan, I predict the expected default rate and use this to predict the expected interest rate for the loan. Loans are then sorted by highest expected rate. The following shows the actual interest rate for investments in the 40 loans with the highest predicted interest rates, up to investments in the best 1000 loans:</p>
<p><a href="http://drjasondavis.files.wordpress.com/2012/04/loans_invested.png"><img class="alignnone size-full wp-image-51" title="loans_invested" alt="" src="http://drjasondavis.files.wordpress.com/2012/04/loans_invested.png?w=440&#038;h=330" width="440" height="330" /></a></p>
<p>For investments over a smaller number of loans (fewer than 400), the logistic regression model clearly outperforms the other two methods. Credit grade binning estimates risk as the average default rate of each credit grade, and the final method assumes a default rate of zero for all loans (i.e. just invest in loans with the highest interest rate first).</p>
<p>To get a better idea of sensitivity, for each of the twelve attributes used to train the model, I trained a new model that held out that one attribute and used the remaining eleven. I then computed the expected interest rate for an investment of 80 loans. The resulting interest rate reductions for each attribute are as follows:</p>
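<p>To make the two baseline methods concrete, here is a rough sketch of how they rank loans. The loan records, grades, and rates are invented for illustration, and the helper names are my own.</p>

```python
from collections import defaultdict

# Hypothetical training loans: (credit_grade, interest_rate, defaulted)
train = [
    ("A", 0.07, False), ("A", 0.08, False), ("A", 0.07, False), ("A", 0.08, True),
    ("C", 0.13, False), ("C", 0.14, True),
    ("E", 0.19, True), ("E", 0.20, True), ("E", 0.18, False), ("E", 0.19, True),
]


def default_rate_by_grade(loans):
    """Credit grade binning: average default rate observed in each grade."""
    counts = defaultdict(lambda: [0, 0])  # grade: [defaults, total]
    for grade, _rate, defaulted in loans:
        counts[grade][0] += int(defaulted)
        counts[grade][1] += 1
    return {grade: d / n for grade, (d, n) in counts.items()}


def rank(loans, risk_of):
    """Sort loan ids by expected rate, highest first."""
    scored = [(rate * (1.0 - risk_of(grade)), loan_id) for loan_id, grade, rate in loans]
    return [loan_id for _score, loan_id in sorted(scored, reverse=True)]


grade_risk = default_rate_by_grade(train)

# Hypothetical loans available to invest in: (id, credit_grade, interest_rate)
candidates = [("loan-1", "A", 0.08), ("loan-2", "C", 0.14), ("loan-3", "E", 0.20)]

binned = rank(candidates, lambda grade: grade_risk[grade])
naive = rank(candidates, lambda grade: 0.0)  # assume no loan ever defaults

print(binned)  # ['loan-2', 'loan-1', 'loan-3']
print(naive)   # ['loan-3', 'loan-2', 'loan-1']
```

<p>The zero-default method simply chases the highest rate, while binning pushes the riskiest grade to the bottom. The logistic regression model plays the same role as <code>grade_risk</code> here, but estimates risk per loan rather than per grade.</p>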
<p><code>Attribute:            Interest rate reduction<br />
amount requested      0.83%<br />
fico range            0.39%<br />
application date      0.35%<br />
earliest credit line  0.31%<br />
interest rate         0.26%<br />
open credit lines     0.26%<br />
total credit lines    0.06%<br />
home ownership        0.04%<br />
credit grade          0.04%<br />
debt to income ratio  0.04%<br />
description length    0.04%<br />
monthly income        0.00%<br />
</code></p>
<p>According to this analysis, the amount requested for a loan is the most important single attribute in the logistic regression model; the interest rate drops by 0.83% if this attribute is omitted. On the other hand, description length is relatively unimportant in terms of model sensitivity. This is because most loans actually have relatively short descriptions.</p>
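<p>The hold-one-out procedure behind that table can be sketched generically. In the sketch below, <code>evaluate</code> stands in for "train a model on these attributes, invest in the top 80 loans, and measure the rate"; the toy version used here is an additive stand-in with numbers chosen to echo the table, not something recomputed from the data.</p>

```python
def ablation_table(attributes, evaluate):
    """Hold out one attribute at a time and report the drop in the evaluated rate.

    evaluate(attrs) is assumed to return the rate achieved by a model
    trained on exactly those attributes.
    """
    baseline = evaluate(tuple(attributes))
    rows = []
    for held_out in attributes:
        remaining = tuple(a for a in attributes if a != held_out)
        rows.append((held_out, baseline - evaluate(remaining)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Toy stand-in for the real train-and-evaluate step (hypothetical numbers).
toy_contribution = {"amount requested": 0.0083, "fico range": 0.0039, "monthly income": 0.0}


def toy_evaluate(attrs):
    return 0.10 + sum(toy_contribution[a] for a in attrs)


table = ablation_table(list(toy_contribution), toy_evaluate)
for name, reduction in table:
    print(f"{name:20s} {reduction:.2%}")
```

<p>One caveat of this style of analysis: correlated attributes can cover for each other, so a small reduction doesn't necessarily mean an attribute carries no signal.</p>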
<p>Surprisingly, application date is actually quite important to the model. However, when investing in a loan, this isn&#8217;t a factor that you can really optimize over, e.g., you can&#8217;t invest in a loan issued in 2007, nor can you invest in a loan in the future that someone hasn&#8217;t yet applied for. It appears as if the Lending Club&#8217;s loan approvals have trended towards riskier loans with higher interest rates:</p>
<p><a href="http://drjasondavis.files.wordpress.com/2012/04/application_date_x_interest_rate.png"><img class="alignnone size-full wp-image-56" title="application_date_x_interest_rate" alt="" src="http://drjasondavis.files.wordpress.com/2012/04/application_date_x_interest_rate.png?w=440&#038;h=330" width="440" height="330" /></a></p>
<p>So, what’s the catch? Why am I blogging here instead of just quietly investing?</p>
<ul>
<li>I do invest in Lending Club loans, and I will be incorporating my analysis here into my investment strategy.</li>
<li>There is of course much more complexity to this problem than I’m presenting. In particular, my model invests in loans with the highest expected return and doesn’t have any real risk model beyond this. I ignore all macroscopic effects.</li>
<li>Perhaps the biggest risk of all is if the Lending Club were to go out of business.</li>
<li>There are lots of details about my analysis that I haven&#8217;t described here. All code can be found on github: <a href="https://github.com/drjasondavis/Lending-Club-Learning">https://github.com/drjasondavis/Lending-Club-Learning</a>.</li>
<li>There&#8217;s a ton more work to be done here: incorporating semantic analysis of descriptions, education information about borrowers, etc.</li>
<li>The Lending Club assesses collection fees for loans that are past due. It&#8217;s not 100% clear how these fees are applied, but it probably makes investing in riskier loans less appealing than the models presented here suggest. See more information here under &#8220;Investor Fees&#8221;: <a href="http://www.lendingclub.com/public/rates-and-fees.action">http://www.lendingclub.com/public/rates-and-fees.action</a></li>
<li>As @jderick points out in the comments, this analysis doesn&#8217;t accurately account for the cost of capital, which is higher for riskier loans with larger default rates.</li>
<li>I’m generally very bullish when it comes to online marketplaces, so I’m excited to share my findings.</li>
</ul>
<p><em>Disclaimer: I am not an investment professional. I do not warrant any information supplied here. Invest at your own risk!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/04/08/lending-club-loan-analysis-making-money-with-logistic-regression/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/04/lc_blogpost_image.jpg" medium="image">
			<media:title type="html">lc_blogpost_image</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/04/description_status.png?w=300" medium="image">
			<media:title type="html">description_status</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/04/loans_invested.png" medium="image">
			<media:title type="html">loans_invested</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2012/04/application_date_x_interest_rate.png" medium="image">
			<media:title type="html">application_date_x_interest_rate</media:title>
		</media:content>
	</item>
		<item>
		<title>Questions on Startups from Luke Carrière</title>
		<link>http://drjasondavis.com/2012/02/11/questions-on-startups-from-luke-carriere/</link>
		<comments>http://drjasondavis.com/2012/02/11/questions-on-startups-from-luke-carriere/#comments</comments>
		<pubDate>Sat, 11 Feb 2012 18:07:23 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.wordpress.com/?p=44</guid>
		<description><![CDATA[When I was working on my PhD in Austin I got involved with a group called 3 Day Startup (3DS).  The idea is simple: get 40 bright, motivated, and entrepreneurial students in a room for 3 days, and have them build something. &#8230; <a href="http://drjasondavis.com/2012/02/11/questions-on-startups-from-luke-carriere/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>When I was working on my PhD in Austin I got involved with a group called <a href="http://3daystartup.org/">3 Day Startup</a> (3DS).  The idea is simple: get 40 bright, motivated, and entrepreneurial students in a room for 3 days, and have them build something.</p>
<p>3DS has grown since I attended their first 3 day event, and they now hold these events worldwide.  They&#8217;ll be holding their first <a href="http://nyc.3daystartup.org/">3DS in NYC on April 20</a>, and I&#8217;ll be helping out (as an advisor this time and a sponsor through Etsy). <a href="http://www.lukecarriere.com/">Luke Carrière</a> is organizing the event, and he sent me a list of questions about my startup experience, answered here.</p>
<div><strong>What is your advice to future entrepreneurs?<br />
</strong><br />
Do what you love, and start a company if that’s your passion. Entrepreneurship is about value creation, disruption of current standards, and ownership.<br />
<strong><strong><br />
How did you recognize the opportunity/research the feasibility of the idea?</strong></strong>&nbsp;</p>
<p>A successful startup is 99% about execution: there are bad ideas that of course will never go anywhere, but “good” ideas hold very little weight on their own, IMO.<br />
My previous startup came about through a project I had been previously working on. I was trying to monetize a search engine I’d built and saw a market gap in online retail ads.&nbsp;</p>
<p>Recognizing a successful opportunity also involves understanding the strengths of the people who will be working on it (i.e. the founding team).<br />
<strong><strong><br />
How did you finance your business?</strong></strong>&nbsp;</p>
<p>Mostly bootstrapped. The internet is a unique place in that production costs are primarily just development time: if you don’t pay yourself, your production costs are zero. We built our initial system on a few machines on Amazon Web Services then scaled up to dozens as we brought on customers.<br />
<strong><br />
What was your growth strategy and why?<br />
</strong><br />
We were primarily a B2B service provider, so our growth strategy ultimately centered around sales.<br />
<strong><br />
What are you willing to give up?<br />
</strong><br />
Ownership was one of my three reasons for being an entrepreneur. Company ownership comes in the form of equity and control, and I would sooner give up equity than control.<br />
<strong><br />
What is your favorite aspect of being an entrepreneur?<br />
</strong><br />
Building cool stuff, re-thinking the way things work, and not having a boss.  Basically a restatement of my reasons for being an entrepreneur.<br />
<strong><br />
What is your least favorite aspect of being an entrepreneur?<br />
</strong><br />
Stress. With ownership and control comes responsibility. If shit hits the fan, you have no one to blame but yourself.<br />
<strong><strong><br />
What have you sacrificed?</strong></strong>&nbsp;</p>
<p>Nothing.</p>
<p><strong>What is the number of companies you have started?</strong>&nbsp;</p>
<p>I’ve worked on several large and substantial projects of my own, but have only started one company with employees, investors, cash flow, etc.&nbsp;</p>
<p><strong>What are lessons that you have learned from starting these companies?</strong>&nbsp;</p>
<p>As your company grows, it’s important to stay focused on the core problems you’re trying to solve. Functions like raising money, HR, business operations, etc. require attention, but must be a second priority.<br />
<strong><br />
</strong><strong>What are the challenges/obstacles you have faced?</strong>&nbsp;</p>
<p>We were presented with an awesome early stage acquisition opportunity 9 months after our series A round. We ended up taking it. I now lead the search and data team at Etsy.com where I’m very happy and work with a great team on interesting problems.<br />
<strong><br />
</strong>I still think there are huge opportunities in the market we were tackling.<br />
<strong><strong><br />
What are some regrets? Biggest mistakes?</strong></strong>&nbsp;</p>
<p>No regrets.</p>
<p><strong>What is your background information? (Education, previous job experience, etc.)</strong>&nbsp;</p>
<p>Ph.D. in data mining. Ex-Google. Lots of hacking on various projects and consulting.<br />
<strong><strong><br />
What is your business structure?</strong></strong>&nbsp;</p>
<p>S-Corp, I think. Whatever my lawyers recommended.<br />
<strong><strong><br />
What resources did you use when starting your company?</strong></strong>&nbsp;</p>
<p>Good advisors are critical, especially for first time entrepreneurs. We had a great advisor.<br />
<strong><strong><br />
How long did it take to go from your initial idea to actually launching it?</strong></strong>&nbsp;</p>
<p>I worked on some of the technology for quite some time. It took maybe 6 weeks to build our first functional system prototype.<br />
<strong><strong><br />
Where do the original investors stand now, if at all, in your company?</strong></strong>&nbsp;</p>
<p>Everyone was happy with our acquisition, including our investors.<br />
<strong><br />
How do you manage your time?<br />
</strong><br />
These days I manage a fairly large team (much bigger than my entire startup). I wake up early and try to get all “real work” done before 11am.<br />
<strong><br />
What did you do with initial profits?<br />
</strong><br />
Left them in the bank and paid server bills.<br />
<strong><br />
How long did it take for your company to become profitable?<br />
</strong><br />
I think we were profitable for a brief period of time right before we raised our series A. Then we hired some awesome people and were no longer profitable <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .<br />
<strong><br />
How did your idea change throughout the process?</strong>&nbsp;</p>
<p>The core idea stayed intact, but we learned how to position it, package it, and sell it.<br />
<strong><br />
Did you ever think of giving up? If so why?<br />
</strong><br />
No.<br />
<strong><br />
What was your initial role? What is your current role in the company now?<br />
</strong><br />
I was the first founder and CEO.<br />
<strong><br />
What is the worst advice you have ever received and why?<br />
</strong><br />
Lawyers generally give very bad advice. They’re generally very smart people, but they don’t understand the business relationships behind the contracts and deals they work on. Business starts with people first and contracts second: legal advice only concerns the latter.<br />
<strong><br />
Which part of your job is actual work as opposed to passion?<br />
</strong><br />
I’m an engineering director now at a 300-person company. I try to spend as much time as possible working on things that are my passion, and I try to impress my entrepreneurial way of working on my team and larger company culture.<br />
<strong><br />
How is the economy affecting your business?</strong>&nbsp;</p>
<p>A bad economy is a great opportunity for disruption. We closed our series A during March of 2008 when the Dow was at its lowest point in 10 years.</p></div>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/02/11/questions-on-startups-from-luke-carriere/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>
	</item>
		<item>
		<title>Why ad networks should optimize for precision and not recall</title>
		<link>http://drjasondavis.com/2012/01/30/why-ad-networks-should-optimize-for-precision-and-not-recall/</link>
		<comments>http://drjasondavis.com/2012/01/30/why-ad-networks-should-optimize-for-precision-and-not-recall/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 15:14:46 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.wordpress.com/?p=33</guid>
		<description><![CDATA[When I was working on my PhD living in Austin, I owned several motorcycles, and spent lots of time online researching parts, upgrades, repairs, etc. on sites like svrider.com and vfrworld.com. Without sites like these, when I had a problem with &#8230; <a href="http://drjasondavis.com/2012/01/30/why-ad-networks-should-optimize-for-precision-and-not-recall/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>When I was working on my PhD and living in Austin, I owned several motorcycles, and spent lots of time online researching parts, upgrades, repairs, etc. on sites like <a href="http://svrider.com/">svrider.com</a> and <a href="http://vfrworld.com">vfrworld.com</a>. Without sites like these, when I had a problem with my motorcycle, I would have had to read the shop manual, go to the parts store, talk to a mechanic, call friends to ask for help, etc. I still did these things on occasion, but online resources made information more immediately accessible, and made my research much more efficient. This sort of information availability is one of the defining disruptions of the web. And not surprisingly, deep content is really my favorite “part” of the web. But internet ads, especially those on many of the sites I frequent, just don’t get my attention.</p>
<p>Many ad networks today claim to have awesome semantic targeting technology that can develop complex models for interpreting content in order to place the most relevant ad. But if a forum post is discussing steel brake lines for a motorcycle, and the ad network only has a generic ad for an auto parts store, then the placement can only be so relevant, regardless of technology. Some of the best content on the web is quite deep, but most ad inventory generally lacks the required specificity. The reason why ad networks today aren’t able to get my attention has nothing to do with their technology. It is 100% due to a lack of inventory.</p>
<p>If I’m reading content about motorcycle brakes, an effective ad should directly address the subject of motorcycle brakes. Perhaps a book on motorcycle brake repair from Amazon, or brake lines for my Suzuki sold through an online merchant.  Semantic targeting technology may be able to make inferences like “Suzuki is a type of motorcycle manufacturer”, and then show a banner ad to buy a new Yamaha bike (with a low APR!). But I can only see so many of these without turning a blind eye, and most of the time I’m not in the market for these sorts of offers anyway.</p>
<p>Highly specific ad inventory has greater potential for high precision: an ad for a book on motorcycle brake maintenance won’t be relevant to every page on a motorcycle site, but it will be highly relevant to some pages. Generic ads have higher potential for recall: an ad for a new Yamaha motorcycle has relevance to potentially any page on a site about motorcycles.  Of course, my argument here is that this sort of relevance just isn’t good enough to capture my attention.</p>
<p>My previous startup focused on this problem directly. We focused on the retail sector, and created thousands or even millions of ads for each advertiser. We created these ads automatically via a content extraction algorithm that was able to recognize and parse products from the web. This problem of automatic ad creation was actually much harder than the ad matching itself. Automatic ad creation enabled us to create a database of tens of millions of products. Our ad targeting was very precise, but not because of any sort of next generation semantic technology. We used well-tuned keyword matching algorithms and leveraged the scale and diversity of our ad database.</p>
<p>In the end, Adtuitive never reached scale to disrupt the ad industry. We were presented with an awesome opportunity to apply ourselves and our technology to work on problems at Etsy, where I now lead an awesome team of engineers and data scientists working on search, advertising, and personalization.</p>
<p>However, I still see huge opportunities here in this space.</p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/01/30/why-ad-networks-should-optimize-for-precision-and-not-recall/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>
	</item>
		<item>
		<title>Recall &amp; Precision</title>
		<link>http://drjasondavis.com/2012/01/21/recall-precision/</link>
		<comments>http://drjasondavis.com/2012/01/21/recall-precision/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 16:14:22 +0000</pubDate>
		<dc:creator>Jason Davis</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://drjasondavis.wordpress.com/?p=11</guid>
		<description><![CDATA[Precision and recall are two fundamental quality measures in search and information retrieval applications.  Google is fundamentally a search application. But Google doesn&#8217;t need to optimize for recall, it just optimizes for precision.  When I Google for “Michael Jordan”, I’m &#8230; <a href="http://drjasondavis.com/2012/01/21/recall-precision/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><a href="http://en.wikipedia.org/wiki/Precision_and_recall">Precision and recall</a> are two fundamental quality measures in search and information retrieval applications.  Google is fundamentally a search application. But Google doesn&#8217;t need to optimize for recall; it just optimizes for precision.  When I Google for “Michael Jordan”, I’m really just looking for a single page about the basketball legend. A search application’s ability to find *all* pages about “Michael Jordan” is a measure of recall, and users don&#8217;t want to read thousands of pages about Mike, so Google doesn&#8217;t optimize for this.</p>
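<p>As a quick illustration of the two measures, here is a small sketch. The page ids and the count of 1000 relevant pages are hypothetical; precision is the fraction of returned results that are relevant, and recall is the fraction of all relevant pages that were returned.</p>

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a list of results against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved.intersection(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical: 1000 pages on the web are about Michael Jordan.
relevant_pages = {"page-%d" % i for i in range(1000)}

# Hypothetical first page of results: four relevant pages and one miss.
top_results = ["page-0", "page-1", "page-2", "page-3", "other-page"]

precision, recall = precision_recall(top_results, relevant_pages)
print(precision, recall)  # 0.8 0.004
```

<p>A result page like this is great from the user's point of view even though recall is close to zero, which is the point of the Michael Jordan example.</p>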
<p>Optimizing for recall is hard, and I spent most of my academic life working on algorithms to improve this measure. At an intuitive level, these algorithms &#8220;discover&#8221; relational inferences between various entities. For example, basketball, court, and rim are related, whereas field goal, penalty shot, and slam dunk are not. At a technical level, these algorithms worked by learning a distance measure between objects. These distance metric learning algorithms did some very heavy lifting to show improvements in quality, but their improvements really only increased recall.</p>
<p>I found the math behind these algorithms to be very interesting.  At the core of these algorithms was optimizing a class of distance functions called Bregman divergences.  The simplest Bregman divergence, which you may be familiar with, is the squared Euclidean distance in two-dimensional space.  This distance generalizes beyond two dimensions to higher-dimensional spaces, and Bregman divergences are a class of distance measures that go beyond the standard sum-of-squares Euclidean distance.  In fact, Bregman divergences can be further generalized to non-vector inputs, and the core distance measure used by these metric learning algorithms is in fact a distance between matrices. Without getting into details, this distance function between matrices is used to capture the relationship between pairs of features (words, phrases, etc.) in the problem domain.</p>
<p>The supervision aspect of this metric learning problem came in the form of pairs of constraints: two points (documents, web pages, etc.) can be constrained to be similar, or points can be constrained to be dissimilar.<br />
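<p>For the curious, a Bregman divergence is built from a convex function phi as D(x, y) = phi(x) - phi(y) - grad_phi(y) . (x - y). The sketch below covers only the vector case: choosing phi as the squared norm recovers the squared Euclidean distance, and choosing negative entropy recovers the KL divergence between probability vectors. The matrix divergence used in the metric learning work is not shown here.</p>

```python
import math


def bregman(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - grad_phi(y) . (x - y)"""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


def sq_norm(v):
    return sum(vi * vi for vi in v)


def sq_norm_grad(v):
    return [2.0 * vi for vi in v]


def neg_entropy(v):
    return sum(vi * math.log(vi) for vi in v)


def neg_entropy_grad(v):
    return [math.log(vi) + 1.0 for vi in v]


# Squared norm: the divergence is the squared Euclidean distance.
x, y = [1.0, 2.0], [4.0, 6.0]
d_euclid = bregman(sq_norm, sq_norm_grad, x, y)
print(d_euclid)  # 25.0, i.e. 3^2 + 4^2

# Negative entropy: the divergence between probability vectors is KL(p || q).
p, q = [0.5, 0.5], [0.9, 0.1]
d_kl = bregman(neg_entropy, neg_entropy_grad, p, q)
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
print(abs(d_kl - kl) < 1e-12)  # True
```

<p>Note that a Bregman divergence is generally not symmetric, so "distance" is used loosely throughout.</p>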
<a href="http://drjasondavis.files.wordpress.com/2011/12/usps-classes.png"><img class="alignnone size-medium wp-image-12" title="usps classes" src="http://drjasondavis.files.wordpress.com/2011/12/usps-classes.png?w=300&#038;h=237" alt="" width="300" height="237" /></a><br />
Each point in the image above represents a handwritten digit (0-9).  Here, images of the same digit would be constrained to be similar, whereas images of different digits would be constrained to be dissimilar.</p>
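<p>One simple way such constraints can be generated from labeled data is sketched below. The labels are made up, and enumerating every pair is only practical for small datasets; in practice one would sample pairs.</p>

```python
from itertools import combinations


def pairwise_constraints(labels):
    """Split all index pairs into similar (same label) and dissimilar (different label)."""
    similar, dissimilar = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            similar.append((i, j))
        else:
            dissimilar.append((i, j))
    return similar, dissimilar


digit_labels = [3, 3, 8, 3, 8]  # hypothetical labels for five digit images
similar, dissimilar = pairwise_constraints(digit_labels)

print(similar)     # [(0, 1), (0, 3), (1, 3), (2, 4)]
print(dissimilar)  # [(0, 2), (0, 4), (1, 2), (1, 4), (2, 3), (3, 4)]
```

<p>A metric learning algorithm then looks for a distance under which the similar pairs are close and the dissimilar pairs are far apart.</p>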
<p>These algorithms did some very heavy lifting to improve problem recall.  And I did some very heavy lifting of my own to develop these algorithms. This problem is extremely hard in general, and I don’t work on these sorts of problems any more.</p>
<p>So, why is optimizing for precision easy? A search like “east village 13th street squatters” is really best served by a handful of highly relevant pages on the subject matter (for example, this page on <a href="http://aphdigital.org/GVH/exhibits/show/eastvillageprotest/squats">east village squatting</a>).  Google is able to find these results because it crawls and indexes billions and billions of pages from the web.  If Google instead indexed say only a few tens of millions of pages, their algorithmic challenges would be much greater: they’d have to make much deeper inferences about pages that are less topical to my query.  For example, <a href="http://en.wikipedia.org/wiki/Squatting">Wikipedia’s page on squatting</a> makes reference to New York City, but it may ultimately refer me to a book that I could only find in a library.</p>
<p>Google of course has developed world class algorithms and has in many ways perfected web search.  But Google optimizes for precision, and optimizing precision is easier than optimizing recall.</p>
<p>And I&#8217;ve found that many real world problems require optimizing precision over recall.</p>
]]></content:encoded>
			<wfw:commentRss>http://drjasondavis.com/2012/01/21/recall-precision/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/5ad800529ca815561e874237f707c810?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonvdavis</media:title>
		</media:content>

		<media:content url="http://drjasondavis.files.wordpress.com/2011/12/usps-classes.png?w=300" medium="image">
			<media:title type="html">usps classes</media:title>
		</media:content>
	</item>
	</channel>
</rss>
