Data leaders powering data-driven innovation
From the onset of the COVID-19 pandemic, educational institutions had to quickly make the shift to teaching fully online. In this episode, Kate Carruthers, Chief Data and Insights Officer at the University of New South Wales Sydney, discusses how she’s helping transform the university into a data-driven organization. Kate and her team are delivering new insights to instructors and students, rapidly moving pilot applications to production, and creating innovative ways to combat new threats that challenge the sanctity of the code of ethics between students and the university.
Kate has extensive experience in senior roles in ICT across the finance sector, marketing, data, and digital. She is a member of the NSW government’s Data Analytics Centre advisory board. Kate was appointed to the Microsoft Regional Director program for her work in cybersecurity in 2020 (this is an external advisory role that provides Microsoft leaders with customer insights and real-world voices).
She is currently working at the intersection of data analytics, AI, ML, privacy, cybersecurity, and data protection.
Speaker 1:
Welcome to Champions of Data + AI brought to you by Databricks. In each episode, we salute Champions of Data + AI, the change agents who are shaking up the status quo. These Mavericks are rethinking how data and AI can enhance the human experience. We’ll dive into their challenges and celebrate their successes all while getting to know these leaders a little more personally.
Chris D’Agostino:
Welcome to the Champions of Data + AI. I’m your host, Chris D’Agostino. COVID-19 has had an unprecedented global impact affecting almost every aspect of our lives. In the world of data and AI, it has forced an immediate digital transformation for many organizations. For educational institutions, such as primary schools, K-12, and universities, they have had to make the shift to fully digital teaching in just a few short weeks.
Chris D’Agostino:
Today, I’m joined by Kate Carruthers, Chief Data & Insights Officer at the University of New South Wales out of Sydney, Australia. Kate discusses how her diverse career and education in anthropology are helping her transform the university to become a data-driven organization. Kate, welcome. And it’s great to have you with us today. So, education is really an interesting subject, especially in light of everything going on in the world with COVID. There’s over a hundred million known cases as of today. With COVID, it’s transformed industries. In some cases, those organizations, companies that haven’t been able to adapt have shut down. Others have had to move to a digitization front very quickly and go to virtual kind of environments. So, too, with universities. And so, I’d like to just get your response to like how has the last year been for you?
Kate Carruthers:
Well, our institution, like everyone else, has had to deal with lockdowns and people not being able to attend campus. And we, at the start of last year, I was talking to some of my colleagues in the School of Computer Science and Engineering and they were going, “My course is special and unique. It can never go online.” Three weeks later, that course was online as was every course we delivered last year. So, COVID has driven the pace of digitization in large state institutions like universities dramatically. And that’s been in response to the need to educate students in spite of the COVID lockdowns.
Chris D’Agostino:
And for the courses, like some of the courses that are more lecture-based I would imagine were obviously easier to facilitate moving to online. What about the ones where, maybe in like chemistry, where there’s equipment involved and laboratory work, how did you handle that?
Kate Carruthers:
Well, that’s actually been a real challenge for us because there are a number of courses, particularly in engineering. I have an academic appointment in the engineering faculty, so I’m really conscious. We’ve got a lot of things that you have to do in labs. You need to be physically present and students can’t graduate without doing that. So, we’ve kind of just kicked that can down the road and we’re assuming that we’ll be able to get students back in the classroom for labs using safe distancing procedures for COVID this year. We did a bit of it late last year. So, in Australia, the COVID numbers are very low. So, we can actually bring our students back into the labs now. Lectures for courses will still remain online, but labs will be starting back this year.
Chris D’Agostino:
Oh, and so perhaps maybe if it’s been sort of fronted with this past year with a lot of lectures, maybe once students return to campus, the lab work that they’ve missed, perhaps you’ll concentrate that a little bit more moving forward.
Kate Carruthers:
Yeah, that’s our idea. Although in medicine, we’ve been experimenting over the last few years with certain things like we don’t use microscopes for a lot of the courses because when you do things with microscopes, everybody has a slightly different sample. So, what we do is we use digital samples so that they can all see the same thing and pick out the same characteristics. So, it’s been an interesting thing over the years, but you still can’t replace the in-person lab. If you’re doing civil engineering, you need to crush concrete, you really need to do that. You can’t just imagine that or see it on TV.
Chris D’Agostino:
So, Kate you’ve mentioned that the case loads are generally low in Australia, but the government and the country have taken some precautions to make sure it stays that way. And as you start to reopen and you look at this past year, how has your role within the university changed due to COVID?
Kate Carruthers:
Well, the interesting thing is it hasn’t really. What we had to do though, is we had to recut a lot of our reports to provide insights. We had a situation where our international students, who are a very big proportion of our revenue, were not able to get into the country. So, we had to shift them to studying online. And we needed to understand who was in the country, who wasn’t in the country. And we never had to worry about that before because the students used to just turn up in class on day one of the term.
Kate Carruthers:
But we had to work out where people were because nobody was on campus. And so we had to recut a lot of reports and provide new insights to people. And that was really challenging for us. And more generally I think across the higher ed sector, the universities were very comfortable and not digitizing very much, but COVID has kind of turbo-charged that now, because we’ve all had to confront the fact that we needed to deliver online. So that was really a significant shift for us as institutions, but literally my job hasn’t changed because of COVID, it’s just made everything exponentially faster. And I think a lot of people have found that about COVID that it’s made the shift exponential.
Chris D’Agostino:
So, Kate would love for the audience to hear a little bit more about your background, kind of your career journey, where you’ve… I found it interesting when we were first getting to know each other that your path to the role that you’re in and the leadership role that you have isn’t perhaps your traditional computer science path. So, would love to hear a little bit about your background, something people wouldn’t see on your LinkedIn profile.
Kate Carruthers:
Well, probably the way that I started out was when I was an undergraduate, I was an arts student. So, I was doing history, anthropology, and philosophy. And that’s probably been the most important thing for me because it really formed my ability to communicate and write and to think. And I only discovered computers when I went to work after university and I discovered an affinity with them and started to get more and more involved with them. And my first job in IT was I got it because I was in the kitchen of the National Trust, which is a charity that looks after buildings and stuff. And I mentioned something was wrong with the computer system to the boss and to the CEO of the organization. And she said, “Do you want to be in charge? You sound like you know what you’re doing.” I said, “Yes.”
Chris D’Agostino:
And then, you had to open up all the manuals, read about it, or you just kind of learned as you went?
Kate Carruthers:
I truly learned on the job. And I was reading up and phoning the vendor and making a pest of myself, but I really loved it. And then, I ended up in banking. I ended up at Citibank and my career went on from there. So, I had like a 20 year plus IT career. And I did a bunch of projects on data warehouses for a number of large financial services organizations. And that’s how I got into data. And I was also a DBA for a while. So, I’ve done a plethora of jobs across the IT sector. So, I bring a very wide experience of technology to the role. But I think too, my academic studies have all been in business and management. So, I bring a fairly unique perspective across business as well as technology and being able to marry the two is probably the most important thing.
Chris D’Agostino:
Yeah. I would imagine your anthropology background makes it so that you’re interested in understanding sort of root cause analysis like what triggered an outage or what triggered data to be used in a certain way? Like looking back in time, how much has that played a role?
Kate Carruthers:
Oh, well because I used to work for GE and I was a Six Sigma Black Belt there. My mind automatically defaults to that kind of thinking. So, there’s a lot of influences in my career that I bring to it now. And I often do Six Sigma kinds of things, but I just don’t tell people I’m doing Six Sigma.
Chris D’Agostino:
Oh, it’s like a Jedi mind trick or something. So Kate, now I’m going to talk a little bit about specifically data and AI and how it’s used at the university and how you work with data to determine student engagement, the success of the online courses, and how students might be able to use the data that’s created on their behalf, their grades and kind of their engagement levels, to figure out if students are on track and things like that for graduating. So, just in general, how has data and AI influenced the university and how you manage the student population there?
Kate Carruthers:
Our original data strategy, which ran from 2019 to 2020, was all about building the foundations on our new Azure data platform. And so we’d done that. And now, we’re looking to the new data strategy, which runs from 2021 to 2022, which is all about doing various proofs of concept using AI and machine learning to deliver insights. And I’ve just had a chat with the folks who look after our teaching and the student experience literally just before this session where we’ve mapped out a number of co-design workshops with students and academics where we’re going to co-create the way that we deliver information to them. So, we’ve always done the traditional thing that in the learning management system you can display stuff and it’s kind of primitive and it’s kind of not really rich. And we’re going to build on a lot of the work that Dr. David Kellermann has done where he’s delivered Power BI dashboards in Teams to the students, and we’re going to co-create that with the students. So, we’re going to create stuff that they actually want to see. We’re also going to do some work around students at risk. So, that’s students at risk of academic failure. We want to be able to identify the characteristics of a student who’s going to fail a component of the coursework as well as students who are at risk of self-harm. So, we’re trying to build up some data points that we think contribute to both of those and we’re going to be then co-creating the dashboards and the insights with the students and the staff.
Chris D’Agostino:
When we first met, you were talking about examples of looking across the university and David Kellermann and trying to basically look for examples of where data and AI has been useful to the university and then accelerating the scaling up of those initiatives. Any other examples that you can think of where you’ve seen something in the university and you thought this would be really great if we did it across all the programs in the university and the need to work with, say the faculty and staff, to get buy-in and support.
Kate Carruthers:
I think one of the big challenges that’s facing higher education at the moment, especially now that we’re all online, is something called contract cheating. And this is where the students pay somebody to write a unique assignment for them. And not only are they breaching their student conduct agreements and stuff, but they also get blackmailed by these people once they graduate. So, it’s a thing that sticks with them for life and it’s problematic for us because we’re potentially granting degrees where the students haven’t done the work and also the students get blackmailed. And so, we’re currently doing a proof of concept now where we’re analyzing our historical data with a hypothesis that we’ve got that we can identify this. So, we’re looking for known… Can we find all the known instances of historical conduct with the contract cheating? And if we can, then we’re going to apply that prospectively. And our goal with that is we want to actually identify when a student has a propensity to do that and intervene before they do it.
Chris D’Agostino:
Interesting.
Kate Carruthers:
And so, that’s something that we’re working on at the moment and…
Chris D’Agostino:
So, is this a bit of like a supervised learning? So, you take known cases, you try and train an algorithm or train a model rather, and then, now as you get new data coming in and new assignments are submitted, you’re now looking for characteristics of those submissions that might match?
Kate Carruthers:
Yeah.
Chris D’Agostino:
Okay. Cool. Very cool.
Kate Carruthers:
Yep, that’s precisely it. So, we’ve got a hypothesis and we’re going to run it. We’re running it against the historic data to see if we can find the known instances. So, we have a team that looks for this. So, they do find students and expel students who’ve done it. So, we have some known instances. So, if we can find the known instances in our historical data set, that’ll prove that the hypothesis is valid. And then, we can use the historic data to train the ML. And then, we can look prospectively. And then, what we want to do is really try and identify what are the precursor behaviors. So, what are the… So, that we can identify the act, but can we also identify the precursor behaviors before they’ve actually done it?
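The approach Kate describes (check a hypothesis against known historical cases, then apply it to new data) can be sketched in a few lines of Python. This is a minimal illustration only, not the university's actual method, which the conversation deliberately does not disclose. The `Record` fields `signal_a` and `signal_b`, the `flags` rule, and the threshold are all hypothetical placeholders, and a simple rule stands in for a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    student_id: str
    signal_a: float           # hypothetical feature, placeholder only
    signal_b: float           # hypothetical feature, placeholder only
    known_case: bool = False  # label from past investigations (historic data only)


def flags(record: Record, threshold: float = 1.0) -> bool:
    """Hypothesis under test: combined signals at or above the threshold are suspicious."""
    return record.signal_a + record.signal_b >= threshold


def backtest(history: list[Record], threshold: float = 1.0) -> tuple[float, float]:
    """Run the hypothesis over historic data and compare against the known cases."""
    flagged = [r for r in history if flags(r, threshold)]
    known = [r for r in history if r.known_case]
    hits = [r for r in flagged if r.known_case]
    recall = len(hits) / len(known) if known else 0.0        # share of known cases found
    precision = len(hits) / len(flagged) if flagged else 0.0  # share of flags that were real
    return recall, precision


# Step 1: does the hypothesis find the known instances in the historic data?
history = [
    Record("s1", 0.9, 0.4, known_case=True),
    Record("s2", 0.1, 0.2),
    Record("s3", 0.7, 0.6, known_case=True),
    Record("s4", 0.8, 0.3),
    Record("s5", 0.2, 0.1, known_case=True),
]
recall, precision = backtest(history)

# Step 2: if the backtest is good enough, apply the same rule prospectively.
new_records = [Record("n1", 0.6, 0.5), Record("n2", 0.1, 0.1)]
to_review = [r.student_id for r in new_records if flags(r)]
```

In this toy data the rule finds two of the three known cases and also flags one record with no known case, which is why both recall and precision are worth checking before anything is applied to new submissions. A flag here would only queue a record for human review.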
Chris D’Agostino:
Great. And so, are you using Azure Databricks to help with some of this data analysis?
Kate Carruthers:
Oh yes, indeed. Yeah. Yeah. And we’re working really closely with the Databricks team here in Australia because we kind of built our Databricks two years ago and we’re kind of conscious that best practice has moved on. So, we’re currently engaged in a rearchitecture of our Databricks so that we can optimize what we’re doing for the new world because what we were doing two years ago was really about reporting. And now it’s much more about AI and ML. So, we want to rethink how we’ve approached our Databricks implementation.
Chris D’Agostino:
It’s great to hear that it’s actually trying to improve students’ lives as well as prevent fraud and dishonesty at the university. So, some of that must be drawn from your background in banking. I have a background in banking. We used Databricks in my prior life before joining Databricks to combat financial fraud. So, would love to hear a little bit about you must have brought some of that thinking forward. And then, how do you see some of these innovations that you’re working on being applicable to the broader university environment, globally?
Kate Carruthers:
Yeah, my time in banking really did influence my thinking on this. And it’s really interesting how the different experiences you have across different sectors can really be used to inform your experience in a different sector. So, I really did. When the conduct and integrity team came and told me they had this problem, I immediately thought it’s directly analogous to fraud in banking. And I used to work in credit cards where there’s a lot of fraud. So, we had very sophisticated fraud tracking and analysis. And so, I brought a lot of the thinking to that, but that was like 10 years ago that I was doing that kind of thing. So, the technology’s moved on in such an amazing way now that we’ve got AI and ML at our fingertips to detect this, which I would have loved when I was back in the bank. So, we’ve got really great tools and techniques now that we can use to identify this. And it’s really a great problem for ML to solve because it’s all about pattern matching. So, that’s really exciting.
Chris D’Agostino:
And in terms of like some of these innovations and some of these use cases, do you see them as being applicable more broadly to universities and education globally?
Kate Carruthers:
Yeah, it’s really interesting. Like we have to be very careful when we’re talking about this stuff around fraud detection, not to give away the way that we’re doing it because the fraudsters watch us and listen to us and will modify their behavior if they understand that we’re looking at particular things. So, we have to be very careful, but we do regularly meet. So, all the universities in Australia and internationally, we used to meet in person, but now we meet through Teams or Zoom, but we do meet and we do discuss this behind closed doors. And we do share our findings and our lessons learned. And it’s an important thing because all of us are smarter than just one of us. And by sharing that knowledge, we can all improve our practice. And we really see that as an important thing.
Kate Carruthers:
And the higher ed sector in particular is very collaborative. So, we’re competitors, but we do collaborate a lot. So, we have international and national bodies that we meet with regularly and we do share stuff, but you can’t always directly generalize one experience to the other place because they often don’t have the same technology. Like a lot of universities don’t have all of their data in a delightfully neat data lake with a curated data lake next to it.
Kate Carruthers:
And, so they don’t have that advantage. So, a lot of them have to build up to doing this whereas we spent two years building our platform, getting our data lined up. So, now we can just run ML across our data sets really easily. And that’s why we spent two years doing that because it was if you’ve got all your data lined up, then you can do AI and ML across the top of it. Whereas if your data is all over the place and you don’t know what it is, you don’t understand it, and you’re not clear on it, then you have real trouble doing that. That becomes a really expensive exercise then.
Chris D’Agostino:
Yeah, we talk to a lot of customers as I’m sure, in terms of where Databricks has reach into different verticals. And one of the key things that we’re hearing in talking to those customers is the desire to do ML is completely dependent on the quality of the data and the volume of the data. And so, we’re really proud of some of the innovations we’ve done that are helping people land data in raw format and give them a really consistent set of APIs in order to curate that data and improve it so that it can be consumed.
Kate Carruthers:
That is absolutely fundamental. So, we use Azure Data Factory to land the data and then we use Databricks to clean, transform, and migrate into the data warehouse and into the curated data lake. And probably the biggest lesson for us was we thought our data scientists would want to use raw data in the raw data lake. Actually, they need curated data. So, I’ll give you an example. So, we put out offers for people who want to attend the university and we put out offers that are conditional. And so, we might say to you, “You can come and study with us, but you have to prove that you’ve got your English up to scratch. So you need to show us your English results.” So, we’ve got all of these conditional offers that go out. Now, data scientists, and they’ve got random alphanumeric codes for all these offers.
Kate Carruthers:
They need those neatly packaged up into a category called conditional offers because they don’t know all of those codes. So, what we actually realized was that the data scientists actually needed curated data in a curated data lake so that they… It was organized to a certain degree so that they could actually understand it. And that was really fundamental for us. And it was a huge shock because everything I’d heard before was that they wanted raw data. They don’t want raw data. They want some meaning attached to the data, but they want to be able to access it.
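The conditional-offers example can be sketched as a small curation step: raw offer codes are mapped to a named category so that analysts never need to know the codes themselves. This is an illustrative sketch only. The codes, the `RAW_TO_CATEGORY` lookup, the field names, and the `curate` function are all invented for the example and do not reflect the university's actual schema or pipeline.

```python
# Hypothetical lookup maintained by people who know what each raw code means.
RAW_TO_CATEGORY = {
    "X7Q2": "conditional_offer",    # e.g. English results still to be supplied
    "K93L": "conditional_offer",    # e.g. some other outstanding condition
    "A100": "unconditional_offer",
}


def curate(raw_rows: list[dict]) -> list[dict]:
    """Attach a readable category to each raw row, keeping the raw code for traceability."""
    curated = []
    for row in raw_rows:
        category = RAW_TO_CATEGORY.get(row["offer_code"], "unmapped")
        curated.append({**row, "offer_category": category})
    return curated


raw = [
    {"applicant_id": "a1", "offer_code": "X7Q2"},
    {"applicant_id": "a2", "offer_code": "A100"},
    {"applicant_id": "a3", "offer_code": "ZZ99"},  # code nobody has mapped yet
]
curated = curate(raw)
conditional = [r["applicant_id"] for r in curated if r["offer_category"] == "conditional_offer"]
```

Keeping the raw code alongside the category, and surfacing unmapped codes instead of dropping them, means the curated layer adds meaning without hiding anything from the people who use it.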
Chris D’Agostino:
See, I mean, it sounds like, Kate, you’ve spent the last few years really just building out that data platform to provide whether it’s the raw data or as you say, with the data scientists in your community, wanting that curated set of data. Just that platform, and then the platform to be able to take in data, move it through a set of curation steps, and then feed these downstream use cases. So, it sounds like you’ve been busy for sure and you’ve got some interesting work ahead of you.
Chris D’Agostino:
Would like to maybe shift gears here now and talk a little bit about people who are aspiring to do the type of role that you have, somebody that wants to lead data and AI organizations, what advice would you give him or her for what things to look for in their career? What opportunities to be seeking out? Because as you said, you come from a non-computer science background, but you’ve married up a way of having critical thinking with some business skills with some banking context. And you’ve been able to take all that information and apply it to the role that you’ve got, would love to hear what kind of advice you’d give either your younger self or somebody that’s aspiring to embark on a data and AI career path.
Kate Carruthers:
I actually give the advice of look for interesting work with smart and nice people because you’ll find if you do those things, if you’re looking for interesting work with smart, nice people, you’ll learn so much in that context. And I also recommend to say yes. Say yes to opportunities that are slightly off. That’s how I ended up with my first data warehouse project. I knew nothing about data warehouses, but I said yes to it and my first data warehouse project. And I discovered it was really interesting. And I also discovered I was the world’s worst metadata modeler, but that was another story. But if you just say yes to things that sound interesting and try to do work with smart, nice people that’s what I see.
Chris D’Agostino:
What are your thoughts on where you see data and AI going in education in three to five years?
Kate Carruthers:
So, I’m not just the Chief Data and Insights Officer at the university. So, I’m also a senior lecturer in computer science and engineering. And I’ve just done my master’s thesis on digital transformation in higher education. So, I do have opinions on this. I think we’re going in a very different way. So, we don’t want to make students feel that they need to eyeball a lecture. In fact, we’re trying to rethink the nature of a lecture. We’re trying to rethink how we educate and how we can engage and how we can chunk stuff down. So, I’m kind of predicting that the lecture may not exist in its current form for much longer. And interestingly, some work that I did with a colleague in mechanical engineering a couple of years ago, where we took all of the content learning, so, you need to learn these rules and be able to apply these formulas to these things.
Kate Carruthers:
We took all of that out of the classroom. And we digitized that in small chunks and we turned the classroom experience and we were left with, oh, what do you do in the classroom when you don’t have any of that content to teach them? Well, actually, you start to do experiential learning with them doing group work, which is what you do in real life. So, we are starting to experiment with new ways of teaching old material. So, engineering is a very well understood body of knowledge. And there’s very traditional ways to teach it, but we’re just starting to re-imagine that and rethink that. And AI is going to let us see how students are interacting with all of the material, all of the content, and we’ll be able to recalibrate that material on the fly. So, this is building on some of the work that Dr. David Kellermann’s done and my other colleague.
Kate Carruthers:
And so, what I’m imagining as kind of the future is that we will chunk up their learning much more and things will be delivered in bites. And we will recalibrate those bites. And the things like the quizzes and tests, we’re going to automate all of those because I hate marking. So, we’re going to… It’s one of my dreams that hand marking goes away. So, David’s done a lot of work to be able to automatically mark freehand drawing. So, students can draw with pen and paper and upload it and then, we can automate the marking of it.
Chris D’Agostino:
Interesting.
Kate Carruthers:
So, we are going to remove a lot of the labor that teachers used to do around marking and stuff and reshape it into our engagement with the students because students actually come to university because of the staff. They come because they want to be with these amazing teachers and stuff, and they don’t get much interaction. They’re sitting there passively in a class. They’re sitting there passively not engaged. We actually want them leaning forward, participating, engaging with the teacher, not just sitting passively receiving a lecture. So, I think education is going to get reshaped in really interesting ways in the future. And it’s all going to be enabled by AI and ML and interesting data-driven technologies. And that’s one of the reasons I got into data because I could see that data underpins the digital transformation that we’re all going through. And I genuinely think any business that’s not digital now is probably dead. So, I really see an interesting and exciting future in education, but it needs to be driven by sound pedagogical principles.
Speaker 1:
Thank you for joining this episode of Champions of Data + AI brought to you by Databricks. Thousands of data leaders rely on Databricks to simplify data and AI so data teams can innovate faster and solve the world’s toughest problems. Visit databricks.com to learn how data leaders are unlocking the true potential of all their data.