April 27, 2018
In the latest episode of JAMstack Radio, Brian invites Vincent Voyer and Emily Hayman to discuss Algolia, a hosted search as a service.
Emily reveals that Algolia’s key differentiator is performance, both in terms of indexing and querying, as well as configurability. They also discuss Instant Search, an easy to use UI library for building a great search interface.
About the Guests
Emily Hayman is a Solutions Engineer at Algolia. Emily is passionate about well-architected Sass, the possibilities of SVGs, modularity, fine-tuning animation performance, and tinkering with data visualizations.
Vincent Voyer is a Software Engineer at Algolia. Vincent is a big fan of small modules, unit and functional testing. Prior to Algolia, Vincent worked as a freelance web performance specialist at Zeroload.
Brian Douglas: So welcome to another installment of JAMstack Radio. In the studio, we have Ryan Neal.
Ryan Neal: How's it going?
Brian: And then from Algolia we have Vincent.
Vincent Voyer: Hi.
Brian: And also Emily.
Emily Hayman: Hi.
Brian: Cool, Emily, why don't you go first, since you're on my left. Do you want to explain what you do at Algolia?
Emily: Sure I'm a solutions engineer at Algolia. And what that means is that I work directly with clients and help them out on their implementations. So that can mean answering technical questions or providing user experience guidance.
Brian: Cool and then Vincent?
Brian: Cool, how big is Algolia at the moment, team-wise?
Emily: So right now I think we're up to about 80 people. Which is incredible because when I started six months ago, we were 55. So growing really quickly which is awesome.
Brian: Okay that's a good sign.
Vincent: Yeah I started two years ago and we were 10 people. So it's impressive.
Brian: Nice yeah because I can't say I knew what Algolia was even six months ago. So do one of you want to take the honors and explain exactly what Algolia is?
Emily: Sure, yeah. So from a super high level, Algolia makes it really easy to create a kick-ass search experience out of the box. We're a hosted search as a service, and we focus primarily on performance, so that's speed, as well as relevancy. What this means in terms of process is you index your data with us and then you configure relevancy, using one of our API clients or via the dashboard we provide. And then on the front end, you can pretty much just query us to build any experience you can imagine.
Brian: Very cool and then wasn't the catch phrase, milliseconds matter? Is that you guys?
Ryan: I love that thing.
Brian: Is that you guys?
Emily: Yep. Performance is so crucial to what we do. Our secret sauce is that we handle a lot of the work at indexing time, instead of at query time. And that means that when you actually query us, it's going to be super super fast.
Ryan: I've seen that when people want to query stuff, they start with an Elasticsearch cluster behind something and hope for the best. I assume that you guys aren't doing that, right?
Emily: Definitely not. So the idea is that you index your data with us and you configure a lot of settings related to that, as part of that indexing time. So when you query us you can definitely include many parameters. But again the bulk of that work is happening beforehand. And that means that the latency is going to be very small.
Brian: So you used the word index a couple of times. What does index mean? Do I upload a zip file, or how does that work?
Emily: So yeah, you push JSON to us. Basically, semi-structured data is what we're looking for. Our bread and butter is searching within that, and any sort of relevance and configurations you make are related to the attributes that you send us. In the Algolia world, JSON objects are called records and they're put into indices. So when you index your data, you're basically sending us a bunch of JSON that you can then configure.
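As a rough illustration of what Emily describes (JSON records pushed into an index, with most of the work done at indexing time), here is a minimal Python sketch. It is not Algolia's engine: the prefix index, the function names, and the sample records are assumptions made up for illustration. Only the `objectID` field name follows Algolia's record convention.

```python
from collections import defaultdict

def build_prefix_index(records, attribute):
    """Index-time work: map every prefix of every word to matching record ids."""
    index = defaultdict(set)
    for record in records:
        for word in record[attribute].lower().split():
            for end in range(1, len(word) + 1):
                index[word[:end]].add(record["objectID"])
    return index

def search(index, query):
    """Query-time work: one dictionary lookup per query word, then intersect."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical records: plain JSON-like objects with a searchable attribute.
records = [
    {"objectID": "1", "name": "Instant Search"},
    {"objectID": "2", "name": "Instant Noodles"},
    {"objectID": "3", "name": "Search Engine"},
]
index = build_prefix_index(records, "name")
```

The expensive part (expanding every word into its prefixes) happens once when the data is indexed, so each keystroke at query time only costs a few lookups.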
Brian: Okay. So you mentioned Elasticsearch, Ryan. And when I tried to use my Google-fu to figure out what else is out there as far as search as a service, Elasticsearch is the only thing that came up. Is there anybody else doing this besides Elasticsearch? Even though it's not the same.
Emily: Yeah so there are a few others. There's Swiftype, Lucene, Solr, Elastic, as you mentioned. From our perspective, we're the best. Again, our differentiators are performance. We're going to be the fastest, both in terms of indexing and query. As well as configurability. We think we're much easier to use out of the box. We're simpler, it's more straight forward. Our relevance is just going to be better.
Brian: Okay. So Vincent and we can talk about some of the projects you've been working on with other companies, like Instant Search, is that what it is?
Vincent: Instant Search. So basically at Algolia we have two main patterns. We have autocomplete menus, which are on top of your page, like on Google Instant. And then we have Instant Search, which is a full page search interface. And what we discovered is that when people wanted to build a product with Algolia, they needed to use a lot of different tools. For example, if you wanted to build a complete search interface, you needed the API clients and then jQuery and the list goes on.
And so we created tutorials for all of that to plug it together, and people managed to build a search interface with it. But at some point they will struggle, or maybe people who are not full-time developers will struggle. So what we wanted to do is to have a single library, like a jQuery UI. I don't know if you guys know jQuery UI.
Vincent: So it's really simple to do widgets with jQuery UI.
Vincent: So we wanted to do the same. We wanted to reverse the thinking from "How do I use Algolia to build the search?" to "How do I build my search with Algolia?" Think about the search page first. When you're thinking about the search page, you use widgets. So Instant Search is a UI library to build a good search interface. You just plug in different widgets: for example, a list of categories, or maybe a menu, a slider, a search box, pagination. Then we do all the plumbing, constructing the HTML page, and we do it for you. So it's kind of a framework.
Brian: Okay, from the outside looking in, I haven't used Algolia for any projects yet. Do you guys cater more to the client side, the front end developers? With things like Instant Search?
Vincent: Do we target developers?
Vincent: Yeah of course, because we have a lot of API clients. When you use Algolia, for example, you can definitely use the Ruby client to do back end searches and then send the content to the user. But the true speed of Algolia comes from the fact that we want the actual users of your website to query us directly from their browser. We don't want to have a proxy.
Ryan: Okay. So if you're coming off a front end, one of the reasons I would build a proxy is for authentication. How do you guys handle the authentication story?
Emily: So there's a few options there. In terms of the API key that we recommend you use, it's going to be a search-only key. So you can just query with it. There's other restrictions you can place on it. You can rate limit it, you can restrict it by IP. Even beyond that, we offer something called secured API keys. Basically, on your back end you generate a hash that includes certain query parameters, and that will often be a filter that will allow you to filter down to a subset of your data.
Ryan: Okay so I would generate potentially two or three keys, my back end would be importing them, my front end would have that somewhere easily accessible, and then I don't have to proxy that.
Emily: Exactly. And then again you don't have to deal with the latency issues of having a back end implementation.
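A minimal sketch of the secured API key idea Emily outlines: the back end signs a set of query parameters (here a filter) with the search-only key, and the resulting key can be handed to the front end without a proxy. The exact format Algolia uses may differ. The function names, the key value, and the `team_id` filter below are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode

def generate_secured_key(parent_key: str, restrictions: dict) -> str:
    """Back end: sign the restrictions with the parent (search-only) key."""
    params = urlencode(sorted(restrictions.items()))
    digest = hmac.new(parent_key.encode(), params.encode(), hashlib.sha256).hexdigest()
    # The restrictions travel with the signature so the server can re-check them.
    return base64.b64encode((digest + params).encode()).decode()

def verify_secured_key(parent_key: str, secured_key: str):
    """Search service: return the enforced restrictions, or None if tampered with."""
    decoded = base64.b64decode(secured_key).decode()
    digest, params = decoded[:64], decoded[64:]  # a SHA-256 hex digest is 64 chars
    expected = hmac.new(parent_key.encode(), params.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(digest, expected):
        return None
    return {name: values[0] for name, values in parse_qs(params).items()}

# Hypothetical usage: restrict one tenant's front end to its own records.
secured = generate_secured_key("search-only-key", {"filters": "team_id:42"})
```

Because the filter is covered by the signature, a user who edits the key in the browser cannot widen the subset of data they are allowed to search.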
Ryan: Yeah I don't even have to keep a back end up if I'm doing that.
Vincent: And also, if you don't have a back end implementation, you remove a single point of failure. Because every time you index your data in Algolia, it's replicated on three servers. And the front end API clients we are giving you will target one of the three servers and fall back to another one if it fails. So there is no single point of failure, and it can get out of your way.
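The client-side fallback Vincent mentions could look roughly like the sketch below: try one replica, and move to the next if it is unreachable. The host names and the fake transport are placeholders for illustration, not Algolia's real endpoints or client code.

```python
def query_with_fallback(hosts, send_query, query):
    """Try each replica in turn and return the first successful response."""
    last_error = None
    for host in hosts:
        try:
            return send_query(host, query)
        except ConnectionError as error:
            last_error = error  # remember the failure and try the next replica
    raise ConnectionError(f"all {len(hosts)} hosts failed") from last_error

# Hypothetical host names and a fake transport, for demonstration only.
HOSTS = ["replica-1.example.com", "replica-2.example.com", "replica-3.example.com"]
DOWN = {"replica-1.example.com"}

def fake_send_query(host, query):
    if host in DOWN:
        raise ConnectionError(f"{host} is unreachable")
    return {"host": host, "query": query, "hits": []}

response = query_with_fallback(HOSTS, fake_send_query, "yarn")
```

In this run the first replica is marked as down, so the response comes from the second one and the caller never sees the failure.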
Brian: Wow, that's right in line with the JAM philosophy right there.
Ryan: Deploy your front end. Please don't have servers in the back end. Let other people handle the search part.
Brian: Yeah it's pretty cool. I mentioned a couple of days ago the way Netlify does our search on netlify.com. I didn't actually talk about the details. I didn't get Eli to come on today to talk about it, even though he actually implemented it. But what we're doing is we're basically indexing all of our pages using Gulp. So we scrape all the data we have into some sort of store and then we search with lunr.js, which is hit or miss most of the time. For the most part it works. But it sounds like we might have over-architected our search on our home page, talking with you guys.
Ryan: Yeah it's usually easier to make somebody else do it. It's much better because they do it way better than what you can think of in an afternoon.
Brian: So I originally saw tweets from you Vincent about different things like hacker news clone. I think you guys are doing the Yarn implementation for search. Do you want to talk a little bit about that?
Vincent: Yeah so the Yarn implementation is a good one. When I joined Algolia, basically what I did is proof of concept searching for NPM packages. And that's when I discovered that it was on some products. And then we proposed that to NPM, but it was one year ago and they were not really prepared for that.
They had a lot of things to do. So we didn't get merged. And then Yarn came, and nowadays we are using Yarn at Algolia because it has solved a good number of issues for us. And at the end of last year, we wanted to give a Christmas gift to the community.
Vincent: So that's what we did, and we wanted to provide Yarn search for packages on the Yarn website. So we did the same, a pull request. This time we warned the Yarn core team beforehand and told them that we were going to do that. "Are you okay?" And they said, "Yeah, we are okay." We did it and it turned out well. It was merged on the 31st of December, 2016. So nice for New Year's Eve. And then today we are like the owner of the search of the Yarn community. So on the website, and maybe soon on the client also, on the Yarn command line client.
Vincent: So we have someone full time doing it. We met people from Yarn in London and we are going to iterate on this. So yeah that's a nice project.
Brian: Wow, I'm looking forward to next year's Christmas present. Yeah, so curious, did NPM get back to you after Yarn merged theirs?
Vincent: So NPM and Yarn have a good relationship. It's a nice competition. It's a healthy competition for them. And basically NPM did the same. They had good relations with a website called npms.io, which was already a search for NPM packages, and they implemented that inside NPM. The NPM guys were happy about Algolia's pull request, but the timing was not good for them. So today they did implement NPM search and it's a good one, and we are trying to compete on the speed and the relevancy side between Yarn and NPM. So it's a good learning for us.
Ryan: You've mentioned structured and unstructured data. You guys do really well with structured data. Is there still a story around unstructured data, or is it please just don't do this?
Emily: I mean to be honest, that's not our strong suit. We don't do TF-IDF, so that's where we actually would point somebody to Elasticsearch. However, you can make structured data out of unstructured data. So in that case, we would ask you to take a deeper look at your data, perhaps reconsider how you are architecting it, and it can be done in a manner that's going to be useful for Algolia.
Ryan: Because I would use Elasticsearch for unstructured data, but almost none of my data really needed to be unstructured. I was just too lazy to structure it.
Emily: That's when I would be like, but "please we can do better".
Ryan: "We can do this for you".
Vincent: We did find some structure from unstructured data. For example we did a project called DocSearch which is able to search into full page documentations. This is something Google does well, because it's unstructured data. But at some point, it has some structured data, and some paragraph belongs to the titles.
So we had a project where we index some full page websites and then transform them to more structured data, which we split into paragraphs and titles. And then we provide the documentation search on the website. You have a documentation search which is Algolia. So it's crawling the website, and then indexing and providing the search. So we are trying to do something like that.
Ryan: Where you kind of take the unstructured data and transform it into something semi-structured, useful, and that's your sweet spot.
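To make the DocSearch idea concrete, here is a small sketch of turning a crawled documentation page into structured records, where each paragraph is attached to the titles it belongs to. This is an assumption-laden illustration using Python's standard `html.parser`, not DocSearch's actual crawler. The class name, the record shape, and the sample HTML are made up.

```python
from html.parser import HTMLParser

class DocRecordExtractor(HTMLParser):
    """Split a page into one record per paragraph, tagged with its headings."""
    HEADINGS = {"h1": 0, "h2": 1, "h3": 2}

    def __init__(self):
        super().__init__()
        self.hierarchy = [None, None, None]  # current h1, h2, h3 titles
        self.records = []
        self._tag = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS or tag == "p":
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = " ".join("".join(self._text).split())
        if tag in self.HEADINGS:
            level = self.HEADINGS[tag]
            self.hierarchy[level] = text
            # A new title resets every deeper level of the hierarchy.
            for deeper in range(level + 1, len(self.hierarchy)):
                self.hierarchy[deeper] = None
        elif text:
            self.records.append({"hierarchy": list(self.hierarchy), "content": text})
        self._tag = None

page = (
    "<h1>Guide</h1><p>Intro text.</p>"
    "<h2>Install</h2><p>Run the <code>installer</code>.</p>"
    "<h2>Usage</h2><h3>Search</h3><p>Type a query.</p>"
)
extractor = DocRecordExtractor()
extractor.feed(page)
records = extractor.records
```

Each record now carries both the paragraph and its place in the page, which is the kind of semi-structured data that can be ranked by title as well as by content.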
Brian: Yeah, and so where's the future of search as a service? Do you have anything interesting on the horizon as far as what you guys are working on internally?
Emily: I'm trying to think about what I can speak about.
Vincent: We have things, I just can't tell you.
Brian: It's okay if you can't speak about it, it'll just be a long beep. Over your paragraph of talking.
Emily: Honestly so a lot of what we're working on is providing more advanced options within the engine, more settings, more configuration. In the past six months especially we've had to deal with some really complex use cases and we've learned a lot from that. We're constantly iterating. There's been a lot of cases where one particular customer came to us and said, "hey we have this need".
Most recently, I'm thinking about how we just introduced a new attribute called paginationLimitedTo, which actually lets you paginate through more than what was initially 1000 results. And things like that happen all the time and we iterate very quickly. So hopefully within the next six months we're going to see even more of that.
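The effect of a pagination cap like the one Emily mentions can be sketched as follows. This is only an illustration of the concept, assuming zero-based pages and a default cap of 1000 as stated in the conversation. The function and parameter names are hypothetical and this is not how the engine is implemented.

```python
def paginate(hits, page, hits_per_page=20, pagination_limited_to=1000):
    """Return one page of hits, never reaching past the pagination cap."""
    reachable = hits[:pagination_limited_to]  # results beyond the cap are unreachable
    start = page * hits_per_page              # pages are zero-based in this sketch
    return reachable[start:start + hits_per_page]

all_hits = list(range(5000))  # stand-in for 5000 ranked results
last_default_page = paginate(all_hits, page=49)
beyond_default = paginate(all_hits, page=50)
beyond_with_higher_cap = paginate(all_hits, page=50, pagination_limited_to=2000)
```

With the default cap, page 50 comes back empty even though more results exist. Raising the cap makes the same page reachable.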
Brian: Is there a limit of how much you can pass to Algolia, as far as what you can index and search through?
Emily: Theoretically the sky is the limit. But obviously there are constraints. We're constrained by the machine at a certain point. And so our enterprise customers, they have one dedicated cluster and that's actually three machines. But we also offer something called a distributed search network, where we can add extra machines if you need more query performance. Our whole conception is, we scale with you. So even if we start out, "oh I only need one cluster", as you grow we can grow with you.
Brian: Cool, and we talked a lot about front end focus, but you mentioned that there was a Ruby library and stuff like that. Do you guys do back end search as a service? So if you had some sort of querying on the back end, as far as data goes.
Vincent: Yeah, we do have some integrations that are querying on the back end. In terms of back end solutions, we have Rails, we have Magento. So those, for example, have some internal code that allows you to automatically synchronize your objects from Magento to Algolia. But in the end, the front end will be the way to go as far as the user search. In terms of back end search, I don't know if we have many use cases for that.
Emily: People often use us to generate emails. So they'll do that on the back end because they don't need that sort of very high performance.
Ryan: I remember talking with our CEO about how we would do search on the front end. And we were going to get really ghetto and during the build time just generate this massive JSON blob, or a series of JSON blobs, and let them be searchable. So you just pull that down and then you just query it in memory. Why would I not do that? For small sites, I mean, I'm not going to do IMDB that way. You know, but if I've got 50 pages of docs, that's not a really large JSON blob.
Emily: I mean, I think from my perspective, search is our focus. We do search full time, we have a team of 40 people working just on search. If you have any sort of need for configuring relevancy at any level, we could handle that for you. As your little tiny site begins to grow, what happens when your JSON blob becomes a monstrosity? We would make it really easy. That's the thing: at the end of the day, you can have something up and running with Algolia in a matter of an hour.
Ryan: Oh that's actually really nice. It would take me longer to configure the code to write the JSON blob.
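For comparison, the do-it-yourself approach Ryan describes (generate a JSON blob at build time, pull it down, and query it in memory) might look like this minimal sketch. The page data and function names are invented for illustration, and the matching is a plain substring test with no relevancy ranking, which is the gap Emily is pointing at.

```python
import json

def build_search_blob(pages):
    """Build step: serialize the searchable fields of every page into one JSON blob."""
    return json.dumps([
        {"url": page["url"], "title": page["title"], "text": page["text"]}
        for page in pages
    ])

def search_blob(blob, query):
    """Client side: load the blob and keep pages containing every query term."""
    terms = query.lower().split()
    matches = []
    for doc in json.loads(blob):
        haystack = (doc["title"] + " " + doc["text"]).lower()
        if terms and all(term in haystack for term in terms):
            matches.append(doc["url"])
    return matches

# Hypothetical pages standing in for a small documentation site.
pages = [
    {"url": "/docs/deploy", "title": "Deploying", "text": "Push to deploy your site."},
    {"url": "/docs/forms", "title": "Forms", "text": "Collect form submissions."},
    {"url": "/docs/dns", "title": "Custom domains", "text": "Point DNS at your site."},
]
blob = build_search_blob(pages)
```

This works for a few dozen pages, but every visitor downloads the whole blob and results come back in build order rather than by relevance.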
Brian: But Graphcool is a back end as a service. So if you guys aren't initiated. You do have their stickers. Are you guys partnering with any other companies like that? Enhancing search?
Emily: I'm running through all the names in my mind.
Brian: I just realized as I asked you that you guys are both not on the marketing team so maybe, probably not in that conversation.
Emily: Unfortunately I just don't know who I can talk about.
Brian: Oh cool well we'll move on.
Emily: Maybe some of the DocSearch. Actually Stripe released us. And we are powering their documentation search.
Brian: Oh that's great.
Vincent: So yeah, we do have some partnerships like that, where they have a product, then they will use Algolia internally and provide maybe an Algolia widget, an Algolia plug-in on their service. We do have some like that. I think we have a cloud service provider in France, which is Clever Cloud. They do that so you can automatically create an Algolia index on your servers.
But most of the time partnerships are more like community partnerships. So a very well known website, they want a good search but they don't have the money or the way to do it. So we do it for them, and then we provide the search by Algolia on the website. So DocSearch is one very good project, as it was actually a side project from three engineers that went to Quarto and they decided to do a nice cool project.
And they built that crawler of documentation web pages, and now it's on 200 websites. So most of the time, any good search you will find on a documentation website, it will be DocSearch. Most Facebook properties have it, for example. And now we have Stripe.
Brian: Yeah Stripe has some pretty good documentation.
Ryan: I usually use them as a reference plate for my documentation, wishing that it'd be that good. And then I don't write it.
Brian: Yeah one day we'll write good docs. So I threw in the actual run down, a link that I didn't even mention. I just threw it in the bottom of the links. But I saw yesterday, Google site search is I guess waning as a project. Not that you guys were competing head to head on Google site search. But I thought it was interesting and very relevant. I guess not really that relevant. But Google site search is like god awful. It's a horrible product, so I'm glad to see it die.
Emily: We are too, believe me.
Ryan: You don't have to have the discussion anymore with customers, please don't use that. And I guess for the listener, Google site search is like search as a service, but Google and it looks like a Google search bar, but not and it searches your site.
Brian: It's pretty bad.
Vincent: Yeah, it always looks like a big iframe on your website with some CSS.
Brian: Yeah. It's basically iframe on your site that sort of searches your site. So if you guys want to move into that market, it's wide open now. Let your marketing team know.
Emily: Actually we had quite a few leads come out of that in the past month or so. People are panicking. "Oh no, what do we do now? We need search." And we're like, "Well, we can provide that for you."
Brian: I know this company. You might have heard of us.
Ryan: So you said that your enterprise plan is three nodes. How large does your largest cluster end up being, like ballpark?
Emily: So we have two options, there's 64 or 128. And then we're actually looking to scale up from there.
Ryan: Oh those are big clusters.
Emily: Yeah people have a lot of data to search within, believe it or not.
Emily: And often times, those bigger ones are needed by SaaS providers. That's where we see the huge chunks of data coming from.
Ryan: Because they're running a multi-tenant service as well.
Ryan: And so when you're not on enterprise, on your other tiers, you guys have like a hosted large cluster, I just put my data in and it's fine?
Emily: Exactly yeah so in that case you're going to use something shared. But for the vast majority of use cases, that's perfectly fine.
Vincent: But even if you're not on enterprise, your data is still on three different nodes. So you get almost the same service level agreement.
Ryan: It's definitely going to stay up.
Emily: Oh yeah. There's that 0.00001.
Ryan: There's a chart somewhere the guys watch it and know it.
Brian: Cool, well I just wanted to bring up that in six months and five days it's my birthday. So if Algolia wants to provide me a birthday present, let me know. We'll wrap up the conversation there and move straight into the picks. Picks are jam picks, things we're jamming on, things that get you going. These can be music picks, movies, tech related. I know you guys had some time to think of them. For some reason I can't remember what my pick was. So Ryan, do you want to go first?
Ryan: Alright, as I figured out I was going to be on this podcast yesterday, the one that came up actually the other day that I've been really on is I'm learning to spin a fire with a staff. And so tonight turns out is this annual fire jam out on the Embarcadero, that I'm going to go spin fire at for the first time. It'll be kind of awesome and there's going to be like a hundred people throwing fire in the air. And it's going to be really cool.
Ryan: And so I come in tomorrow a little like missing hair on my arms, no eyebrows, don't mock me too much.
Brian: We know it was a successful night.
Ryan: Exactly. Or a complete failure.
Brian: Well if you keep most of the hair on your body, then you know.
Ryan: I'm good.
Ryan: I only whack myself in the face like once or twice every time I spin it.
Vincent: But when did you decide to like, I want to spin fire?
Ryan: A couple of months ago. It's been something I want to do. I've seen a bunch of people do it. And then I was hanging out with a bunch of my friends, one of these guys just starts spinning and I was just like, I'm actually now going to go buy the staff to do this and then started spinning at my house. There's a bunch of marks on my ceiling of me like, "oh the ceiling is not quite high enough to do that."
Emily: See I thought you just wanted to make the fire jam pun.
Ryan: Nope. I didn't even realize that one. I will pretend I did that.
Brian: Well Ryan's a back end developer, so he has to find his excitement elsewhere.
Ryan: I will get pages of scrolling text on a terminal. This is all I do.
Brian: Awesome, so I have a pick. And it's a company that's been on here before, Serverless. And they have a great product, which is basically a command line tool to access the AWS console. So you can actually do Lambda functions from the command line that has added on as of today this morning. And then I think they also have another client, OpenWhisk is the other one as well.
So functions as a service. I am actually building a project and I'm going to be using their cron job feature so I can scrape ESPN data for the upcoming baseball season, because I have a project that I'll probably talk about as a later pick once I actually finish it. But until I finish it. Actually I think I've already talked about it at length in a previous podcast. Anyways, it's going to be another pick. But yeah, Serverless is my choice. And I will also pick the movie The Arrival. I finally watched it.
Ryan: It's really good.
Brian: I have a small child so I don't watch movies in the movie theater anymore. It's almost impossible.
Ryan: The world thanks you for that.
Brian: Yeah I think I've taken him to one movie his entire life. It was Finding Dory and that's the extent of my movie going. So anyway, finally saw The Arrival. It's great, it's basically like the predecessor to Interstellar. It almost seems like Interstellar. Anyway you should watch the movie if you haven't seen it. I might have been the last person on earth to see it. So that's my pick.
Ryan: Yeah you're a little behind the curve on that one.
Brian: Yeah just slightly behind the curve. Vincent do you have any picks?
Vincent: Yeah, I chose a non-tech pick. I'm from Paris, I am 32 years old, and lately I have been playing basketball a lot. When I was younger, I was playing a lot of basketball. And then when I came to Paris and started working, I felt like I was missing something in terms of sports. So I wanted to play basketball, but you have to find some good people to play with.
And it's not that easy at some point because you are not in the same circuit. You are not playing games on the weekend and stuff. So I just wanted to have one game every week on a weekday. So I tried to recruit people inside Algolia, and now every week we are going to play basketball with three or four people and then joining other people. And even when there are some other SF people coming to Paris, the first thing I will ask them after saying hello is, "Hey, do you want to play basketball on Wednesday?"
And then when I came here and I stayed for two weeks, I thought, I will just buy a basketball on Amazon and then we'll go play and we did. Even yesterday evening we did also play basketball. So I'm trying to play basketball every time I can, because it's a very good way like when you wake up the next morning you can do better coding from my point of view.
Vincent: So yeah I really love it. And tonight I will see the Golden State Warriors versus Clippers game.
Ryan: Okay awesome. Yeah I was going to ask if you saw a game yet.
Vincent: Yeah when I come here for two weeks, since I'm really a basketball fan, I have to see an NBA game. And being in SF, it's really good because the Golden State are really really good.
Brian: Yeah they're good so you don't have to watch a horrible team like Kings. Sacramento Kings by the way. They used to be good back in the day. No longer. Sorry I have lots of friends who are Kings fans. So I just take every chance to make a dig at them.
Ryan: We're going to play this for them just on the radio.
Brian: Well I grew up a Magic fan. So Shaquille O'Neal, that was my team growing up, even though it was only for two and a half years he was on the team. But anyway that was my team and they haven't been good since. And it's been very sad watching basketball growing up, until I moved here which is awesome. So Vincent if you disappear in a year to go semi-pro basketball professionally.
Vincent: Yeah maybe.
Brian: Awesome, I'll look forward to it. Remember me when you get really big.
Ryan: His birthday's in six months and five days. Just letting you know.
Brian: And then Emily you have any picks for us?
Emily: Yeah so I really really love CSS. But I'm super excited about some of the new stuff that's coming out. For example, like CSS variables I think are awesome. But one of the cool things I actually found out last week is you can animate the D attribute of an SVG using just CSS. Which kind of blew my mind a little bit. I got really excited about it. So that would be my pick.
Brian: Nice, yeah we're using variables in our new UI actually so Netlify's new UI, we're using variables in there. And then Flexbox is very heavily used.
Emily: I love Flexbox.
Brian: Yeah, I don't really understand CSS Grid Layout yet.
Emily: Yeah, I'm still trying to wrap my mind around that one a little bit. But once all the browsers are on board, that will be the moment.
Brian: Yeah so cool. Say that again, D grade animations and SVG.
Emily: The D attribute. So the draw attribute. You can actually animate on SVG, so on a path so you can just change the path. You hover over it using just CSS.
Brian: Oh, yeah that sounds like a weekend project. I'm going to have to research that. I've been away from CSS for a couple of months. So I need to catch up.
Ryan: You don't miss it?
Brian: I don't miss it a ton, just because we have good designers that write code at Netlify and they do most of our CSS for us. So I've waned in my experience in CSS since I've been at Netlify. But yeah I can't say I really missed it.
Emily: It's totally fair.
Brian: Awesome so hey guys I thank you for coming here. I was going to say walked all the way six blocks, but you didn't walk. You drove in an Uber.
Vincent: Maybe we'll take the bus back.
Emily: We'll walk uphill both ways.
Brian: Do you guys have any way people want to contact you if they want to find you on the internet and talk more about search?
Vincent: Yeah so my Github account is vvo. And on Twitter, I'm @vvoyer.
Emily: Yeah you can find me on Twitter. I'm @eehayman.
Brian: Awesome well thanks again guys for coming in here and talking about search. Ryan thanks for coming in here last minute.