ŷhat

  • How Yhat Does Cloud Balancing: A Case Study

    If you're reading this, you're probably aware that we at Yhat offer a public sandbox where anyone can try out our ScienceOps product without charge. Giving potential customers the chance to try out and evaluate our products is an important tool for generating new business. What may not be obvious are the logistical challenges behind such an offering. This post will address some of those challenges, along with operational modeling techniques we use to solve them. Specifically, we ...

    Nov 10 2014
    by Ryan J. O'Neil
  • Introducing db.py

    Data analysis libraries for Python keep getting better and better. pandas is now on 0.15, scikit-learn continues to pick up converts, and a number of visualization libraries have started to emerge: for example, our own baby, ggplot, and others like seaborn. But there's still one subject that's practically synonymous with data analysis where there aren't any new, killer libraries: databases. There are some great Python database libraries (SQLAlchemy instantly comes to mind), but none are focused ...

    Nov 05 2014
    by Greg Lamp
  • Using data science to build better products

    Data science as a field of study is growing at an epic pace. There are competitions to build the best predictive algorithms, tons of data blogs and tutorials, and a number of fast-growing (and hugely successful) professional education platforms for teaching data science skills (Insight Data Science, Zipfian Academy, General Assembly, Coursera, Udacity). Data science has even taken a seat at the big kids' table, with many of the most prestigious colleges and universities now offering undergraduate and graduate-level degrees ...

    Sep 17 2014
    by Colin Ristig
  • Analysing your e-commerce funnel with R

    This post is by Justin Marciszewski, Founding Partner at Harbor Island Analytics, an analytics consultancy specializing in e-commerce, digital marketing, and user behavior strategy. Harbor Island helps clients use data to identify new opportunities, reach audiences more effectively, and make stickier apps. Find Justin on LinkedIn, Twitter, or GitHub, or reach him by email at justin [at] harborislandanalytics [dot] com. Intro: Optimizing on-site or in-app sales is one of the most common problems in online retail, if not the most common. I thought ...

    Aug 05 2014
    by Justin Marciszewski
  • Fuzzy Matching with Yhat

    The Problem: Ever had to manually comb through a database looking for duplicates? Anyone who's ever had a data entry job probably knows what I'm talking about. It's not fun! In this post I'm going to show you how you can write a simple yet effective algorithm for finding duplicates in your data. Some Example Data: Our example data consists of 500 records, each containing an id, 2 names, and 2 addresses. The address only consists ...

    Jul 23 2014
    by Greg
  • Yhat Sciencebox

    We're really excited to announce the release of our newest product, Yhat Sciencebox! In working on data science projects, we've often found that spinning up new servers, ssh-ing into boxes, and waiting for lengthy installs of Python and R libraries can be a real pain. For people new to data science or data scientists who simply aren't as technically oriented, offloading computationally intensive jobs to external servers can be a daunting task. Enter Sciencebox: Sciencebox is a ...

    Jun 17 2014
    by Colin Ristig
  • Python Sparse Random Projections

    This guest post is brought to you by Adrian Rosebrock, PhD, a computer vision expert and creator of pyimagesearch.com, a blog about computer vision and image search. Adrian is an entrepreneur and creator of ID My Pill, an app for identifying prescription pills using your smartphone’s camera, as well as Chic Engine, a fashion search engine that lets you search clothes with pics. You can find Adrian on LinkedIn, Twitter, and GitHub. Intro: There are many data-related ...

    Jun 05 2014
    by Adrian Rosebrock
  • Yhat meets Go

    If you are a regular reader here, this post is going to be a bit of a different format than what you are used to. But I promise it will be just as interesting and will include a few Top Gun references. Welcome to the first Yhat Engineering blog post! Changes: Tupac loved them, and we do too. Recently we made some large changes to the architecture that makes up our software, the main component of which was recently rewritten in ...

    May 29 2014
    by Jess Frazelle

Post Index


How Yhat Does Cloud Balancing: A Case Study

Ryan J. O'Neil | Nov 10, 2014


Introducing db.py

Greg Lamp | Nov 05, 2014


Using data science to build better products

Colin Ristig | Sep 17, 2014


Analysing your e-commerce funnel with R

Justin Marciszewski | Aug 05, 2014


Fuzzy Matching with Yhat

Greg | Jul 23, 2014


Yhat Sciencebox

Colin Ristig | Jun 17, 2014


Python Sparse Random Projections

Adrian Rosebrock | Jun 05, 2014


Yhat meets Go

Jess Frazelle | May 29, 2014


Neural networks and a dive into Julia

Eric Chiang | May 15, 2014


ggplot tutorial

Greg | May 02, 2014


Python Multi-armed Bandits (and Beer!)

Eric Chiang | Apr 07, 2014


Predicting customer churn with scikit-learn

Eric Chiang | Mar 20, 2014


Real-time NLP with Twitter and Yhat

Greg | Mar 14, 2014


Yhat at NY Enterprise Technology Meetup

Greg | Mar 11, 2014


Yhat at the SF Data Science Meetup

Greg | Feb 17, 2014


Image Processing with scikit-image

Eric Chiang | Jan 30, 2014


What's new in ggplot-0.4?

Yhat | Jan 22, 2014


Data Science in Python

Greg | Jan 13, 2014


Detecting Outlier Car Prices on the Web

Josh Levy | Dec 18, 2013


Weather Forecasting with Twitter & Pandas

Eric Chiang | Dec 05, 2013


Building email reports with R

yhat | Nov 22, 2013


Aggregating & plotting time series in python

yhat | Nov 03, 2013


ggplot for python

Yhat | Oct 13, 2013


Random Forest Regression and Classification in R and Python

yhat | Sep 29, 2013


Fast summary statistics in R with data.table

Jeff | Sep 26, 2013


Two great things that go great together: Yhat and fantasy football

Drew Conway | Aug 25, 2013


Estimating User Lifetimes - the right and many wrong ways

Cam Davidson-Pilon | Aug 20, 2013


Machine Learning for Predicting Bad Loans

yhat | Aug 16, 2013


10 Books for Data Enthusiasts

yhat | Aug 11, 2013


PyData Boston 2013 Slides

yhat | Jul 29, 2013


Intuitive Classification using KNN and Python

yhat | Jul 25, 2013


Recognizing Handwritten Digits in Python

yhat | Jul 14, 2013


Named Entities in Law & Order Episodes

yhat | Jul 04, 2013


Running R in the Cloud (Part 1)

yhat | Jun 27, 2013


Statistical Quality Control in R

yhat | Jun 25, 2013


Recommendation System in R

yhat | Jun 19, 2013


Content-based image classification in Python

yhat | Jun 12, 2013


Random Forests in Python

yhat | Jun 05, 2013


Fitting & Interpreting Linear Models in R

yhat | May 18, 2013


Deploy Your R Models to yhat

yhat | May 10, 2013


pandas & google analytics

yhat | Apr 12, 2013


7 handy SQL features for data scientists

yhat | Apr 09, 2013


yhat is going to PyCon

yhat | Mar 10, 2013


Logistic Regression in Python

yhat | Mar 03, 2013


SQL for pandas DataFrames

yhat | Feb 24, 2013


R and pandas and what I've learned about each

yhat | Feb 16, 2013


Setting Up Scientific Python

yhat | Feb 15, 2013


10 R packages I wish I knew about earlier

yhat | Feb 10, 2013


Predicting SMS spam

yhat | Jan 08, 2013


Repeatable, Scalable, Analytics using yhat

yhat | Jan 05, 2013