HubStor’s Geoff Bourgeois and Nittygritty’s Marcus Roberts discuss the data management challenges faced by firms in the Architecture, Engineering, and Construction (AEC) industry.
Full Interview Transcript
[00:00:00] Geoff Bourgeois: Good morning, good afternoon, whatever time zone you’re in. This is Geoff Bourgeois with HubStor, and I’m joined with Marcus Roberts from Nittygritty. He is the director of Nittygritty. We’re sitting down in one of the first editions of the HubStor video blog. Nittygritty started working with HubStor recently and I thought it would be a great idea to have Marcus on, because he’s got great insight into the market. Marcus, can you tell us about Nittygritty and the type of clients that you work with, and how you’re helping them today?
[00:00:40] Marcus Roberts: Hello, Geoff. Thank you. Yes. I work for Nittygritty, we’ve been around for about 13 years, and our focus has always been on the AEC industry. My co-director has got a background in Architecture, so it was a natural pick for us to start providing IT to those industries. At the moment we split the business between, about two thirds is IT support for — primarily architecture in the London area, but also engineers and construction. We have a set of what we call BIM consultants, who go into offices to help people use BIM tools like Revit, and a small part of the company is based around software development.
[00:01:17] Geoff: Okay. Well, that’s great. Just, in case people aren’t familiar with BIM, what does BIM stand for?
[00:01:22] Marcus: BIM is Building Information Modelling. It’s the move from the two-dimensional plans that architecture traditionally used into a 3D model. But more important than the 3D-ness of it is the capture of information about the building. You get a database about the building, you can query it and say, “Tell me how many doors there are in this building. What are the fire ratings of these various doors? Have I missed any components out?” That sort of thing.
[00:01:45] Geoff: Okay. Excellent. These types of customers that you’re working with they’re storing a lot of data obviously, and complex data sets, right? Applications that are in charge of this data. They probably want low latency access to it. What kind of trends are you paying attention to most these days when you’re working with your clients?
[00:02:10] Marcus: The obvious trend is the cloud. As internet connectivity is getting better, people are getting more confident in moving services off-premise. The big shift recently has been moving on-premise collaboration and email, which is Exchange, into Office 365. That’s been followed up by a move to more software-as-a-service solutions, so things like Box and Dropbox and other collaboration suites. Also now people are getting confident to start storing their data in the cloud. That’s backups traditionally, but more recently archive and live assets too.
[00:02:43] Geoff: That makes sense. We’re seeing that on the other side of the pond here as well. Specifically with your AEC customers what types of IT challenges are really keeping them up at night?
[00:02:54] Marcus: It’s great for us as an IT company, because architects in particular are heavy users of IT infrastructure. They’re generally generating a lot of data, so before they even start the project, the marketing people are creating big InDesign and PowerPoint presentations to win the work. They’re very heavily graphics-based. Then, once it’s gone on to the design process, you’ve got the architects who are creating the design documents, which are now these complex databases.
Then, they move over [sound cut] working with people who do [sound cut] start generating large data sets. Particularly for us in the AEC industry, there’s quite a lot of regulation around the information they produce. For example, a lot of the practices are working on retaining drawings and other building-related information for more than 12 years after building completion. If there are issues with the building, people want to come back and have a court case about some irregularity in the building process.
Or if something falls down, worst case, then they need all that information to defend themselves with; insurers insist on that. In architecture and engineering as well, the projects aren’t short term, they’re very long term. You could be looking at three to five years for the duration of a project, so that data needs to be kept around for quite a long time. The other issue is, people always want to refer back to work they’ve done before.
You might deliver a project which has got a building and the client could come back to you years later and say, “Okay. I want to add an extension,” or, “I want to do some work. I need to understand that building,” so you need that data back. But in the creative industries people always want to refer back to projects they worked on before, and have a look at the information. You can’t just say, “Okay. The project is finished now, let’s move that into an archive,” because then we need to get that information back.
[00:04:40] Geoff: Yes. That’s right. We see that as well with PR and marketing companies. They tend to do project based work, and they tend to have data retention requirements that are specified in their customer contracts. I think with the AEC customer, 12-year retention, is that standard? Are you sometimes seeing based on the type of work that it can be longer? I think with some of our clients, we’ve seen up to 30 or 17 years. I want to say 30, but I could be wrong, but 17 years is another number that I’ve heard.
[00:05:12] Marcus: Yes. It depends on the insurance that they’ve got as well. But some of these buildings they’re putting up could be around for 100 years, so even if it’s not some sort of insurance-generated thing, if you’ve just built an airport, it’s quite likely in 20, 30, 40 years they’re going to come back to you and ask you for some of that data. Another interesting requirement that we have is that we need versions of data at a point in time as well.
So more than being able to say we can get the data back from a backup or an archive as it was at the end of the project, we need to be able to show how things have developed through time, and what information was issued at a given point in time. So it’s more, “What did this look like in December 2012, versus what did this look like in December 2017?” That is an important question we need to be able to answer.
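[Editor’s note: The point-in-time question Marcus describes can be illustrated with a small sketch. This is not how Nittygritty or HubStor implement versioning; the version history, dates, and revision labels below are hypothetical, and the lookup is simply a binary search over issue dates.]

```python
from bisect import bisect_right
from datetime import date

# Hypothetical issue history for a single drawing: (issue date, revision label).
# The list must be sorted by issue date, oldest first.
history = [
    (date(2012, 3, 1), "rev-A"),
    (date(2012, 11, 20), "rev-B"),
    (date(2015, 6, 5), "rev-C"),
    (date(2017, 12, 1), "rev-D"),
]


def version_as_of(history, when):
    """Return the latest revision issued on or before `when`, or None."""
    issue_dates = [issued for issued, _ in history]
    # bisect_right gives the count of revisions issued on or before `when`.
    idx = bisect_right(issue_dates, when)
    return history[idx - 1][1] if idx else None


if __name__ == "__main__":
    print(version_as_of(history, date(2012, 12, 31)))
    print(version_as_of(history, date(2017, 12, 31)))
```

[Asking for the state at the end of December 2012 and at the end of December 2017 returns different revisions, which is the kind of answer a plain “latest copy” backup cannot give.]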
[00:05:59] Geoff: If we think more specifically about the data management challenges that this type of organization faces. They obviously have the remote branch office scenario, and remote job sites. They’ve got to think about data protection, putting IT services out on job sites and data protection around that. Then as you’re talking about, they have this long-term retention component as well, which can be a challenge obviously if they’re not handling that with an archiving system. Or, if they’re doing what many companies have done today, which is, well, we’re just going to have our primary storage and back it up and replicate it. The three, two, one rule, and not tier off this longer retention data. It can all become quite cumbersome.
Specifically, in the data management area, how have firms traditionally tackled that problem and what trends are you seeing there?
[00:06:53] Marcus: Well, unfortunately, until about five years ago, archive seemed to mean offline to a lot of people. The traditional solution would be, again, you have to get all of these parties to agree that the project is finished. That is a job in itself, because even if the building is finished, people still say, “Well, I need to refer back. I’m still issuing documents out to people.” Then it would just get archived to tape, and that’s the very worst sort of archive you can do.
Because the tape goes off-site, it’s lost in a catalog somewhere. Discoverability is just a really hard problem to solve there. In more recent years, we’ve seen a trend towards nearline storage on-prem. So instead of it going offline, you just move it across to cheap storage. But even doing that, the problem we have, particularly with the applications we use such as InDesign and AutoCAD, is that there are references between the documents.
So you suddenly break all of your references by moving stuff into archive. People can’t even access the data in place, even if it was available to them, because when they open these drawings, all the references are broken. What we would like to see is, not that archiving doesn’t happen, but we can’t keep on buying more and more storage, so we need an archive which is in place if we can.
[00:08:08] Geoff: Interesting. With cloud technologies, how are you seeing your customers adopt cloud technologies and how is that bringing about some sort of transformation then?
[00:08:22] Marcus: People are looking at how to use cloud. Once people move to Office 365 it’s, what are the next steps? Backup becomes an obvious one, so people have been using tapes before. They’ve done a great job over the last 60, 80 years. To be honest, tape capacities have kept up, and tape speeds have kept up. But managing the backup processes is the bane of every IT manager’s life. You log in every Friday to make sure it’s all ready to go. You log in every Monday, you find that half the jobs didn’t work, because the tapes were emptied or whatever.
Also, the tape drive is the most mechanical thing in the server room, so that’s the thing that breaks down the most. There are quite a few backup solutions out there now which replicate that tape-based solution, but just push the data into the cloud. We’re seeing some uptake of that. We looked at simple solutions of just copying your archive data into the cloud, but then that becomes really difficult to use. The discoverability issue comes up: how do you know what’s in the cloud, and where to find it?
The latency is poor. People are trying to browse around and find things and it just doesn’t work for them. I suppose it’s a question of where the data lives, isn’t it? With email, when you can move the email into the cloud and make the tools work off it, moving it off to Office 365 works really well for people. If you can start to engage with software-as-a-service-type solutions, that’s where the data is naturally homed in the cloud.
That also works really well, because you’re essentially logging into that remote system and accessing it. Those problems have been solved by people, to a greater degree. There’s Box, who are great for your online, sort of, internet and accessing information. Now we’re looking at, “Hey, look at the file system. How do we see that going into the cloud?” People are looking at cloud storage gateways as well.
They’re quite a large-scale investment though, so they’re quite expensive. They are a rip and replace. You take your Windows File Server and you replace it with a Unix-based file server, for most of the cloud storage gateways we’ve looked at. That doesn’t work for a lot of people. Most people are more comfortable with the Windows File System rather than the Unix system underneath.
[00:10:35] Geoff: Right. Because they’ve made all these investments in their IT infrastructure, and the way that users are used to working, and that’s a big shift. It’s kind of funny because you’re getting into the cloud, you’re putting this new appliance on-prem, right? [laughs]
[00:10:52] Marcus: Exactly. Yes.
[00:10:53] Geoff: Add new infrastructure, and you’re in the cloud.
[00:10:56] Marcus: Yes.
[00:10:57] Geoff: I think that’s a common gripe that we’ve heard about cloud storage gateways too. They’re great solutions for some companies, but the math doesn’t always compute. That’s quite interesting. I think we’re seeing the same thing with email, where Office 365 adoption seems to be the first step into the cloud, and file servers seem to be the last sort of thing. We have a customer, for instance, that used us [HubStor] as part of their transformation to a cloud-only initiative, right. Not just cloud first, but cloud only.
When they contacted us it was, “We basically moved everything to the cloud except our file servers and we have no idea where to go with them.” I think you were alluding to this is that the common thinking is, well stick them in blob storage, or file storage in the cloud. Then the realization happens where it’s, well, this is really just infrastructure as a service. What I’m really looking for is that SaaS experience where users can sign in, get at their own data, maybe there is some search capability. And we can easily get things out.
But, when you just go and say put it to blob storage that’s not the case. It’s infrastructure, it’s very low level. You either need to roll up your sleeves yourself and do some development work to build the connectivity there. It also sounds like regarding the cloud strategy that AEC companies are probably going with a multi-vendor approach then, right? It’s not one vendor that’s going to take them to the cloud completely.
[00:12:47] Marcus: No. Absolutely, I mean maybe in 10 years one person will dominate with a monopoly, but so you already- [crosstalk] [laughter] In the email space there’s Office 365, or G-Suite are two vendors that we have to work with both. Because particularly the creative studios have a preference to G-Suite, whereas Office 365 seems to be more dominant in architectural industries.
There’s a number of vendors who look at collaboration and software. I’ve mentioned Dropbox and Box, but there are many others in that space. Of course, Microsoft is the big player and they’re developing their own solutions. They’ve been releasing some stuff for us, yes. All these cloud storage gateway providers go through [sound cut] as well.
I think it’s important to look at how you can bring together the best of breed of all of these solutions, rather than just go for one particular vendor who isn’t going to solve all your problems. Have a look at the problem you’re trying to solve specifically, and look at the best solution for that. Because it’s all cloud at the end of the day, so you’re not particularly locked in to one vendor there, so in terms of your solutions have a look at those different providers.
[00:13:52] Geoff: Yes. That’s an interesting angle, and I agree with you. We see that with some of the cloud storage gateway vendors, for instance, it’s almost becoming a boil-the-ocean sort of sales pitch that they’re trying to sell. Which is, “Hey, it’s not just collaboration. You’re going to put these filers, these caching filers, on-prem and it’s going to give you a global file share. Your users can then work on projects together. It’s like everyone’s seeing the same stuff.”
It’s like, “No. Actually we want to also replace all your existing infrastructure, it’s continuous backup that’s built into it and we do archiving, and we want to take over the whole thing, right?” I think in some cases that can make sense for some companies. They’re willing to do that and they’re at that stage, but in many cases companies have bought Office 365. They have things like OneDrive for Business. They already have file sharing. They might also have something like Box or Dropbox.
I think in my opinion the cloud is great for archive, backup and DR. That’s where it really wins. It can really come back to the physical issue of network connectivity, the latency, and AEC customers, based on the type of data that they’re working with and the applications that generate and need access to that data, don’t deal well with latency.
That’s why you need these caching mechanisms on-prem, but I don’t think everyone has the appetite to go into the cloud storage gateway world, because of the upfront cost, the big shift in your infrastructure, and managing this new hardware. I think that with regard to cloud strategy, it’s easier to, like you say, start with the email, probably; you’re in Office 365 or Google. You’ve already made an on-prem investment, and that infrastructure probably still has some life cycle left to it, I’m sure at varying stages.
The idea here isn’t, “I just want to throw everything in the garbage and start fresh.” It’s, “I want to dip my toe in the water.” When you’re thinking about your clients and shifting to the cloud, are you recommending, let’s do a whole big rip and replace? I guess not. You’re probably talking about, let’s dip your toe in the water and wade into this thing and go at it strategically. What do you see as being the easiest low-hanging fruit or the quickest win when you’re talking to your customers about cloud adoption?
[00:16:25] Marcus: Well, it’s interesting. Just to go back quite quickly, disaster recovery is being revolutionized by this. I’ve been doing a disaster recovery review with one of our clients recently, and it’s so different to 10 years ago. Because 10 years ago, the office building burning down would have been a disaster, really, it would have been a disaster, and now we’re just considering how we can get enough desktops to people to start working again, because the majority of the data is in the cloud and ready to go.
We’re starting that process, that move towards the data being homed in the cloud by default and then copied into the office for performance, because the latency is still an important consideration. It’s brave, well, not brave, but you have to be quite bought into the whole process to take your existing file servers and just take them out and copy them to a new solution. And how are you going to make sure that works, that the performance is there, and how are you going to get your data back out if you don’t like the solution?
Particularly for the size of the companies that we work with, which are like 40 to 150 strong, there’s a level of comfort with the Windows File Server, so we’re saying, “Okay. What are the easy steps you can take towards the cloud with file servers, moving the files into the cloud?” I think archiving, well, backup is the first one, so get rid of the tapes and do your backup to the cloud, and then the next logical step is archives. So how can we start to archive data in a way that it’s still discoverable, and it’s easy for people to find?
We don’t have to go through this whole archiving process. Can we do some in-place archiving? That is, you start cloud tiering, so that data that you might not be using very much is moving into the cloud. We don’t have to invest in replacing our storage all the time because we keep hitting the limits of it; we can just keep pushing more and more into the cloud.
Then we’ll probably find naturally, maybe 90% of our files are now in the cloud by default because that’s the strategy we set. It’s solving that problem for us in IT, so we don’t have to be badgering people about archiving stuff all the time, and it’s making people happy that they’re not seeing their stuff disappearing off into the archive black hole every year.
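[Editor’s note: As a rough illustration of the age-based cloud tiering Marcus describes, the sketch below picks out files that have not been accessed for a given period. It is a minimal sketch, not HubStor’s actual policy engine; the file paths, the one-year threshold, and the function name are hypothetical.]

```python
from datetime import datetime, timedelta


def select_for_tiering(files, now, min_age_days=365):
    """Return the paths whose last access is older than `min_age_days`.

    `files` is an iterable of (path, last_access) pairs. A real tiering
    tool would read these timestamps from the file system instead.
    """
    cutoff = now - timedelta(days=min_age_days)
    return [path for path, last_access in files if last_access < cutoff]


if __name__ == "__main__":
    # Hypothetical project files with their last-access timestamps.
    files = [
        ("projects/2012-airport/plan.dwg", datetime(2013, 1, 10)),
        ("projects/2018-tower/model.rvt", datetime(2018, 5, 1)),
    ]
    # Only the 2012 airport drawing is old enough to be tiered.
    print(select_for_tiering(files, now=datetime(2018, 6, 1)))
```

[Because the rule is driven by age rather than by a project sign-off, nobody has to declare a project “finished” before its low-touch data moves to cheaper storage.]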
[00:18:31] Geoff: That’s a good point. That’s a neat way of going about it, and as we’ve seen, cloud tiering sounds scary, but it really isn’t. Provided that A, the cloud solution that you’re tiering to has a nice exit plan. Yes, you can get your data out easily. You’re not being locked in to this thing. It’s not dark data in the sense that users can’t get at it if they need it, but because you’re tiering data as it ages, it’s a pretty safe play, because you’ve kind of validated that this is old, low-touch data; a great candidate for the cloud.
Things have to be qualified to be cloud ready before they go. But the other side of the coin here is that on-prem, you start running lean and mean. Which means that backups aren’t what they used to be. As you said, we’re not adding more and more storage until we run out of slots. That’s a neat picture, and I think tying that back to the conversation we were having earlier around tape, and how customers have traditionally used tape.
I think that a lot of organizations are wrestling with this issue today: where does tape fit into the strategy, and does it still play a role if we start going into the cloud? There’s this expectation that cloud is a replacement for tape. It’s a neat discussion to have because these are apples and oranges really. Both are media that can solve your long-term retention problem. But they are two totally different models, both from a technology point of view and from a cost model point of view. Any thoughts on the comparison between tape and cloud?
[00:20:15] Marcus: What immediately came to mind is that sort of disaster recovery, and it doesn’t have to be the office burning down. It can just be the drive failing on your server, which we’ve seen less of these days, and you’re in a position where your whole project drive is offline and you need to recover it. Then with tape you’re like, “Okay. Well, we can recover the whole tape.”
Because, to be honest, just trying to find the subset of projects which are most urgent on your backup tape takes just as long as restoring it all anyway. So everyone’s sitting around six to eight hours while your tape’s churning away restoring that data. With cloud storage gateways and the technologies that HubStor has, you can actually recover really quickly.
You can say, “Okay. Just pull down the placeholders that you need for this data,” and everyone starts working and starts pulling on demand the files that are most urgent to get people up and running again. I see that as really quite an important distinction between the two. Tape has been a great backstop for all this time, but it’s cumbersome: even if you need one file back, you have to wait for the tape to come back from the warehouse, you put it in the tape drive, you wait while the tape spools to it, whereas if you pull that one file from the cloud it’s immediately available.
[00:21:17] Geoff: Yes. That’s a great point. I think that gives you the faster RTO, probably greater agility around recovery scenarios, and I think the other interesting angle is around cost. We were talking about this the last time, that cloud probably isn’t going to outperform tape in terms of cost; it’s going to be more expensive. If all you care about is pricing, then stick with tape.
[00:21:43] Marcus: Yes.
[00:21:44] Geoff: I think where cloud really beats out tape is in the convenience factor, and that convenience manifests in multiple areas. There isn’t this robotic tape library that I’m in charge of; I’m not having to maintain and manage that infrastructure and the processes that it facilitates. The other angle, of course, is when you’re talking long-term retention. I was talking with my wife at breakfast the other day about our family photos, and I said, “When was the last time we backed this stuff up?”
“Are you still doing that routine, where we’re taking the data off the hard drives and getting it onto optical and then creating another copy that we off-site to your parents’ place?” She said, “Yes, but I haven’t done that in four or five months,” and it’s like, “You know what? We need to get a HubStor tenant for our own family to put all of our family records and family photos and stuff.” Because I think this is data that we’re going to want to keep for our lifetime.
[00:22:43] Marcus: Yes.
[00:22:43] Geoff: And pass down generations. And what I don’t want to have to worry about, every five, ten years, whatever it is, is refreshing the underlying media. So if you have a customer that has 12-year retention requirements, they’re starting to stretch the lifespan of tape for long-term retention, are they not?
[00:23:01] Marcus: Absolutely, yes. That’s a really good point, because you could consider the cost differential between tape and cloud, and you think, “Well, the tape is quite cheap,” but actually if you look at the long-term price of cloud storage it’s going down, and down, and down. For tape to keep for 12 years, it needs to be in a properly maintained environment, so we use professional tape storage, and the cost per slot on the shelf stays the same.
So you might have an LTO-1 tape on there with a couple of hundred gigs, and you might have an LTO-6, which has got a couple of terabytes, but actually you’re not seeing any savings over time. One of the things we’re dreading [sound cut] recalling media from that storage, reading it in and rewriting it onto more modern media to shrink the size of the data and the tapes we’ve got there, and just sort of refresh that media.
With cloud that problem goes away, because I think the trend is going to continue. It’s already crazily cheap per gig compared to how much hard drives used to cost, and it can surely only go on down. With long-term archive, the cost goes down, and down, and down with cloud, and I don’t think you see that with tape if you need to store them professionally like we need to.
[00:24:03] Geoff: Yes. I think the nice thing with the cloud too is you have things like erasure coding, which is basically RAID on-prem, but it’s in the cloud. So you have the data integrity checking that’s happening, that’s self-healing, it’s redundant. You essentially outsource that whole problem of hardware refresh, right. Which I think makes a lot of sense if you’re talking data that needs to be kept for a long period of time.
[00:24:27] Marcus: Absolutely, yes. Make it somebody else’s problem, that’s what IT managers like, yes.
[00:24:31] Geoff: Yes. My last question for you then, Marcus is, your customers are contemplating the cloud, they might have some experience maybe not. But when you’re advising them, what are you telling them are some of the top things that they should be considering or evaluating, looking out for, when they’re looking at going to the cloud?
[00:24:54] Marcus: The first one, and sometimes they don’t like it, particularly for the size of companies that we work with: the cloud probably isn’t going to save them any money. People quite often come along thinking that cloud’s going to be a lot cheaper, and I like to say to them, “It’s going to give you a better solution.” Some of the things we’ve talked about today, the flexibility, the long-term recoverability of stuff, the disaster recovery, are all going to be big positives you’re going to get moving to the cloud.
Don’t expect to see your bill halved. Bigger organizations might have more efficiencies and they might see some savings there, but I think look at the cloud as providing you with a better way of working rather than as a money-saving thing. When talking to the financial directors you can sometimes make quite a case for a move from CapEx to OpEx. If they want to stop investing upfront in large amounts of storage arrays and servers, and instead, not quite offset it, but spend that money over a longer term, that works for them.
On a slightly negative side, I always say to them, “Think about how you can get out of the situation.” When you’ve got on-prem stuff it’s been a natural progression to move. You could change your server provider from HP to Dell on the next refresh; you can move from Windows to Mac if you want to on the next refresh. Once stuff is in the cloud it might be in a format that you can’t do anything with.
Just make sure there’s some way to bring it back out again. But to end more positively than negatively, I think the cloud is very exciting now. For the size of companies we work with, it is offering them a lot of great opportunities for reducing some costs and having much better solutions in place, such as for their archiving and managed disaster recovery.
[00:26:29] Geoff: I think those are all great points, those are all great points. Your clients are lucky to have you.
[00:26:33] Marcus: Thank you.
[00:26:36] Geoff: All right. Let me see here. I want to just bring up our closing slide, if that will work. I’d like to thank you for making the time to do this interview with me. I think it’s been a great discussion and let me just bring up our website. If you want to learn more about Nittygritty, the website address is https://www.nittygritty.net, and HubStor is https://www.HubStor.net. We’re not the only ones with a great company with a dot net website. Marcus thanks again for your time and I look forward to speaking with you again soon.
[00:27:12] Marcus: Thank you Geoff. Really enjoyed it. Thanks a lot.
[00:27:15] Geoff: All right. Thank you.
[00:27:16] [END OF AUDIO]