Friday, May 28, 2010

Banking and Technology

Even though I don't work in the financial services industry anymore, I still keep up with many of those who do, and I usually find it an interesting area to read up on and discuss. This week, Ron Shevlin posted "Banking is Not a Technology Business", where he discusses "technology companies" like Mint and BankSimple.

I read the article, chewed on it for a while, and finally decided to comment. And then, when I actually started typing, my comment got freakishly long. Since it was turning into a blog post, I thought I'd make it one...

(For context, it's probably valuable to read both the article and the comments of Ron's article.)

----

I don't have this 100% pulled together, but I'll give it a shot so I can get to the long weekend without this on my mind...

As technology becomes a bigger part of the advancements that almost every industry is undertaking, I do believe that it opens opportunities for "technology companies" to invade the space.

When I look at Apple in media distribution (and soon to be in advertising), Amazon in "everything" distribution, and Google in the advertising space, I see companies that are first and foremost technology companies and have found ways to use their technology to make money in areas that lie outside of traditional hardware and paid software models. These technologies have impacted how those services are consumed and created in a big way.

So companies are coming along that say, "Look. The hardest part of this industry/exchange/service is getting the technology right. I can be good at that and figure out the rest."

Bankers have taken the exact opposite approach: I know banking...I'll outsource the technology. And that may be working...but I hear more than a few complaints about the stranglehold core vendors have on banks...so maybe it's not.

I'm starting to think that as services become commoditized they become ripe for technology companies to not only invade the space, but become leaders and game changers. I hear many say that banking is moving towards (if not already at) a commoditized level, so I think it's a very possible candidate. (However, I do think banking's regulations help protect it from an all-out invasion.)

Now as far as being clueless, not making money, etc....I couldn't agree more. That said, being clueless and being naive are different...and sometimes it takes naivety to try something that shouldn't work but somehow does.

At least I hope so...I'd hate to think I have to wait for 15-20 more years of experience in my field to do anything of significance in it...much less have an opportunity to extend what I know to other areas of industry. ;)

Tuesday, March 9, 2010

Wednesday is the Worst Day for Email

Last week after my Twitter Predictive dump, I ended up chatting with a few people about the whole thing. I'm super glad to be able to start having these conversations even if it only helps me realize how little I know about what I'm trying to do :)

One of the conversations that came up was talk about regression analysis and email trends.

Still Learning

First, it's worth noting that the Markov chain I've been toying with has close ties to Bayesian inference models, which are in fact a type of regression...at least according to Wikipedia. Yes, snicker if you must, but I'm catching up on a lot of my math on Wikipedia while I try to locate the right resources for what I'm attempting. (I did get a math minor at A&M, but unfortunately none of the courses I remember covered this type of modeling, and it's far outside the realm of any books I kept.) I did locate some courses on iTunes U that I'll check out this week, but if you have any resources, let's have 'em.

No Email Wednesdays

Now when it comes to email, apparently Wednesday is the worst day to send it. It just doesn't get opened on Wednesdays. On top of that, each day has its own twists and turns that correlate to open rates.

Now, one way to look at this is that the day of the week should be part of any predictive model having to do with email. (And while we're at it, twitter...or any other communication-based activity.) We can help determine what tomorrow's activity will look like by factoring in what day tomorrow will be.

In this case, the day of the week becomes one of many environmental variables to include in the model. I'm not against this, and I'm sure a mature model does include many environmental variables to help with predictions. (Think weather forecasting again.)

However, what I'm most interested in right now are behavioral variables. What are the actions I can look at today to determine tomorrow? It may not be that much of a difference but it is in my head.

In fact, in my head, by monitoring a series of behaviors, we can actually prove a weekly pattern. Instead of Wednesday being the worst day for email because it's Wednesday, it's just part of an Active, Active, Dormant, Active... behavioral pattern. (I guess after seeing this over and over, perhaps a behavioral pattern becomes so accepted that it becomes an environmental variable?)
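To make that distinction concrete, here's a toy Ruby sketch (the activity labels are made up, and this isn't code from my project) that looks at the same data both ways: dormancy rate by weekday on one hand, and what follows two active days in a row on the other.

```ruby
require "date"

# Hypothetical sample: :active / :dormant labels for two weeks,
# starting on a Monday (March 1, 2010)
states = {}
labels = [:active, :active, :dormant, :active, :active, :active, :dormant] * 2
labels.each_with_index { |s, i| states[Date.new(2010, 3, 1) + i] = s }

# Environmental view: how often is each weekday dormant?
by_weekday = Hash.new { |h, k| h[k] = [] }
states.each { |date, s| by_weekday[date.strftime("%A")] << s }
dormant_rate = by_weekday.transform_values do |ss|
  ss.count(:dormant).to_f / ss.size
end

# Behavioral view: how often does Active, Active lead to Dormant?
ordered = states.sort.map(&:last)
runs = ordered.each_cons(3).select { |a, b, _| a == :active && b == :active }
p_dormant_after_two_active =
  runs.count { |_, _, c| c == :dormant }.to_f / runs.size
```

In the made-up data the two views mostly agree (Wednesdays are always dormant, and two active days in a row usually lead to a dormant one), but on real data they could diverge, which is exactly the question I'm poking at.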

Like I said, there may be no difference there at all, but I thought I'd share my thoughts on it anyways. Keep'em coming.

Sunday, February 28, 2010

Twitter Forecasting: How likely is it that I'll tweet tomorrow?

For a while now I've been swirling around the notion of forecasting (or simulating) the activities of members in social media communities, especially Twitter. (Think of the weather. Now think of hurricane predictions. Now replace the hurricane with a twitter member and replace land with different twitter activities. Now you're thinking like I'm thinking. I'm on a horse.)

In my quick glances across the web, I have seen a bit about social forecasting in general (and some really really smart people talking about it), and I've seen lots of forecasts in regard to the industry of social media, but not really anything addressing the forecasting of specific activities within a specific community.

Trending is something that the industry is starting to get a hold of, but what I'm talking about goes beyond trending. Trending tells us what has happened. I'm interested in what will (or at least is likely to) happen. In my mind forecasting presents some exciting possibilities in being able to help everyone involved in social media communities from the community managers to the system admins.

For community managers, the benefit lies in being able to work off of indicators that signal the rise of a community star or the trailing off of one. Whole simulations could be run based on activity in a member's first 24 hours.

For systems admins it could help in identifying spikes before they happen or identifying the best days (or hours) to perform maintenance and updates.

Maybe I'm overstating the possibilities, but I think it's big and it's exciting stuff to me. Let's get started.

How likely is it that I'll tweet tomorrow?

It's a simple question but one that has lots of implications. It can help a community manager know whether I'm engaged or slipping away. It can help a systems admin know if I'll be requiring any computing resources.

If we were to start by saying: "What's the probability I will tweet on any given day?" we'd find out that over the last 6 months I've tweeted on 57% of the days, just a little over half of the days. But our question isn't about "any given day" it's about tomorrow.

We know "tomorrow" is one day in a series of my twitter engagement. So what if we knew whether or not I had already tweeted today? Would that make a difference in our probability? Am I more likely to tweet tomorrow if I've tweeted today? Less likely?
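That 57% baseline is just a simple frequency. Here's a quick Ruby sketch of the calculation (with a made-up week of daily counts rather than my real six months of data):

```ruby
# Baseline: on what fraction of days did I tweet at all?
# The counts below are sample data, not my real history.
daily_counts = { "2010-02-20" => 3, "2010-02-21" => 0, "2010-02-22" => 1,
                 "2010-02-23" => 0, "2010-02-24" => 5, "2010-02-25" => 0,
                 "2010-02-26" => 2 }

active_days = daily_counts.values.count { |c| c > 0 }
p_tweet_any_day = active_days.to_f / daily_counts.size
# 4 active days out of 7 gives roughly 0.57
```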

Math modeling (no...not that kind of modeling)

What if we could develop a simple model that helps us predict my future tweets by my given state? Turns out, we can...or at least we can try. In this case the key is to focus on the probability of transitioning from one state to the next: Given that I tweeted today, what's the likelihood that I tweet tomorrow? What about for the next 3 days?

So that's what I did. I chose a Markov model (which I've looked at before, but never used...so please, if you see this and are some kind of Markov model expert, let me know how I did) and started working away at trying to come up with one that was more than just throwing numbers up in the air.

Now...my model is currently based on a very small sample size (just my own twitter activity), and therefore has a very high margin of error, but it's a start. (There are obvious problems with the "super" predictions due to my small sample.)

Here's what I have:


          dor  act  sup
dormant [ 0.46 0.54 0.00 ]
active  [ 0.40 0.59 0.01 ]
super   [ 0.00 1.00 0.00 ]


I've created three states called "dormant", "active", and "super". A dormant user will not tweet in a given day. An active user will tweet 1 to 9 times. A super user will tweet 10 or more times in a day. The matrix represents the probability of moving from one state to the next.
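For the curious, here's roughly how a matrix like this can be built (a Ruby sketch with made-up daily counts, not the exact code I ran): bucket each day into one of the three states, count transitions between consecutive days, and normalize each row so it sums to 1.

```ruby
STATES = [:dormant, :active, :super]

# Bucket a day's tweet count into a state
def state_for(count)
  return :dormant if count.zero?
  count < 10 ? :active : :super
end

# Build rows of P(tomorrow's state | today's state) from daily counts
def transition_matrix(daily_counts)
  states = daily_counts.map { |c| state_for(c) }
  counts = Hash.new { |h, k| h[k] = Hash.new(0) }
  states.each_cons(2) { |from, to| counts[from][to] += 1 }

  STATES.map do |from|
    total = STATES.sum { |to| counts[from][to] }.to_f
    STATES.map { |to| total.zero? ? 0.0 : counts[from][to] / total }
  end
end

# Made-up week-and-a-day of tweet counts
matrix = transition_matrix([0, 3, 1, 0, 12, 4, 0, 2])
```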

For example, if I tweet today, there is actually only a 40% chance that I won't tweet tomorrow. (In contrast, there is a 60% chance I tweet at least once.) If I'm dormant today, there's only a 54% chance that I tweet tomorrow.

Let's look at this again. Our simple average says there's a 57% chance I tweet tomorrow, but our model says there could be as high as a 60% chance and as low as a 54% chance. While those all sound close together, when you are talking thousands or millions of users, those small percentage points can mean a lot.
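And since the question earlier was also about the next 3 days, here's a small Ruby sketch using the matrix above: start with a distribution over today's state and push it through the matrix once per day.

```ruby
# Transition matrix from the post: rows are today's state,
# columns are tomorrow's (dormant, active, super)
MATRIX = [ [0.46, 0.54, 0.00],   # from dormant
           [0.40, 0.59, 0.01],   # from active
           [0.00, 1.00, 0.00] ]  # from super

# Advance a state distribution by one day: new[j] = sum_i dist[i] * M[i][j]
def step(dist)
  (0..2).map do |j|
    (0..2).sum { |i| dist[i] * MATRIX[i][j] }
  end
end

# Start from "active today" and walk forward three days
dist = [0.0, 1.0, 0.0]
3.times { dist = step(dist) }
p_tweet_in_3_days = dist[1] + dist[2]  # active or super
```

Starting from "active today", after three days the chance of tweeting (active or super) settles in at a bit under 58% with these numbers.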

More to come...

To me this is only the beginning. Here's what I'm looking at for next steps:

1) A model based off of all the people I follow on twitter. It gives me a larger sample size (that's not ginormous) and could be really cool to take a gander at.

2) A functional programming model. I did all of my initial fetching and computation in Ruby, which is my hammer in this programming world. However, one of the sparks that got me to actually start on this project was the opportunity to have something concrete to apply a functional language to. I believe the crunching involved in predicting large datasets would benefit greatly from a functional design, and given their current hotness I'd love to let one loose on this thing.

3) A model based off of a large twitter sampling. Lots of users. Lots of crunching.

4) A web service? I'm still working on the best way to practically expose this data and work with it. I've got lots of ideas but none that are shining right now...

5) Other models. I'd love to try some other models and test their effectiveness...

That'll do it for now. I'd love to hear your feedback on this.

----

Also, if you're interested, I've posted the code I wrote this weekend up in a gist. It was two files, but I've combined them for sharing. I did it in a style I'm going to call narrative programming, where the main concern is not reuse or architecture, but instead telling a story and following a train of thought from beginning to end.

Saturday, December 5, 2009

On to the Next Adventure

Monday morning, bright and semi-early, I start my next professional adventure. I'll be joining Sabre Holdings. Sabre is a travel technology company that has as much past (it developed the first electronic reservation system) as it does present (Travelocity) and future (check out TripCase), and I'm excited about the opportunities there.

I'll be joining the Travel Studios group, which is tasked with experimenting with emerging technologies and products. Specifically, I'll be working on Cubeless, an enterprise social network. I'm tellin' ya...these social network thingies...they're gonna catch on. ;)

I'm really excited about joining this organization, group, and product team. Thanks to everyone that's been supportive during the transition and I look forward to the things to come as a Sabre-ite...Sabreon...Sabreian...Sabrenian... ;)

Thursday, November 19, 2009

Change

There's no easy way to talk about this without just saying it...Friday, December 4 will be my last day at The Garland Group. I've decided to start on a different adventure that I'll talk about more as it gets closer.

Before I move on, I do want to expend a few 1s and 0s on just how much I have enjoyed working at The Garland Group. Honestly, if I were to really get into every inch of what I've enjoyed, you'd probably think this "Garland Group" place doesn't really exist...it's just a fairy tale I've made up to make other work environments look silly.

You probably wouldn't believe that you can work next to someone that's in a different state or complete projects without ever formalizing work hours. You'd never imagine that encouraging people to go to the grocery store in the middle of a Tuesday could lead to results above expectations. You'd think there is no way a small company headquartered in the Dallas suburbs could change the way an industry thinks and talks about Compliance and that Security could involve more Collaboration than combinations.

Yes, I suppose it does sound a little far-fetched, but it's been my reality. It's a tough place to say goodbye to and a hard group to part ways with. I have very much enjoyed my time at The Garland Group, and though I am very excited about what is on the horizon, I'd be kidding myself if I said I won't miss it.

Wednesday, October 28, 2009

Abstract Analyzer

Here's a project I've been plugging away at:



(Big version here, or use the tiny full screen button in the bottom right of the movie)


Gem: http://gemcutter.org/gems/abstract_analyzer
Source: http://github.com/markmcspadden/abstract-analyzer

Saturday, October 24, 2009

Rolling with Raindrop

Today Mozilla announced a message aggregation project they've been working on called Raindrop. Now, I find myself slower and slower to fall for the hype on these types of things, but after downloading the source and getting it up and running on my machine, I have to say it looks pretty cool.

The install took a couple of hours, mostly due to my non-existent Python chops and having to download and install Mercurial. But after a while I was up and running and pulling down my twitter feeds and emails into the same location. Way cool.

Spelunking

But what kind of nerd would I be if I just got it running and stopped at that? ;)

In addition to directions on how to set up twitter and gmail aggregation, the default install comes with a single RSS feed baked in. But who can do with just one RSS feed, right? So I set out to add another.

It proved pretty difficult just to find where the initial feed was set (Textmate Ack in Project failed me) so I hopped into the Raindrop chat room where Mark Hammond from the Raindrop team pointed me to the correct directory. (Mark also said he didn't know if anyone had even tried this yet, but I wasn't going to let that little detail stop me.)

With his help I soon had 2 RSS feeds dumping into my Raindrop. Cool. Sure they were being dumped into a single box that had the wrong heading, but it was a start.

The major thing that bugged me was that they were streaming in without links. Boooo. Feeds need links. So I started hacking at the JS and HTML implementation of those messages, and before long I had an external link for each headline.

Missing the Point

I felt pretty accomplished, but then I sat back and realized that I had missed the boat on the whole point of Raindrop. It's not just about aggregating data and then pushing you off to the outside world; from what I understand, it's about being able to interact with all types of content from a single location. The external links had to go.

A quick look at the twitter implementation revealed what I needed: a link to the CouchDB doc that actually holds each story. The Raindrop UI is set up to handle the display of these, so after digging in and finding out where to get the document id, I was in business. I could view entries from multiple feeds, open them within the Raindrop application, and even archive them. (I have no idea where they go when they're archived; I just know they leave the front page.)

So all that work for this little diff:

diff -r e07f7793ad1b client/lib/rdw/story/templates/GenericGroupMessage.html
--- a/client/lib/rdw/story/templates/GenericGroupMessage.html  Fri Oct 23 15:10:01 2009 +1100
+++ b/client/lib/rdw/story/templates/GenericGroupMessage.html  Sat Oct 24 02:23:34 2009 -0500
@@ -4,7 +4,7 @@
   </div>
   <div class="message">
     <div class="content">
-      <span class="subject">${subject}</span>
+      <span class="subject">${subject} <a href="#${expandLink}" class="expand" title="${i18n.expand}">${i18n.expand}</a></span>
     </div>
   </div>
 </div>


It doesn't seem like much, but it represents several hours of education and paradigm examination, and I'm proud of it.

Now, off to bed...