>> From the Library of Congress in Washington, D.C. >> Carson Block: Right? All right. Thanks so much. Welcome. Good afternoon. I hope everyone had a great lunch. And, you know, it's so bright up here and so dark out there, if you fell asleep we wouldn't know it. But we're going to do our best to make sure that nobody in here falls asleep, because this is going to be an excellent conversation. My name is Carson Block. I'm a library technology consultant on this Tech Trends panel. >> John Resig: I'm John Resig. I work at Khan Academy, but in my spare time I do a lot of work with libraries. I'm a developer by background. >> Alison Macrina: And my name is Alison Macrina and I run the Library Freedom Project, which is a privacy activism organization for libraries and their communities. And I also work with the Tor Project, which builds privacy and anonymity software. >> Carson Block: Outstanding. We've got a great set of panelists. We -- >> Alison Macrina: Shoutout to Jaimee. >> Carson Block: -- also are missing somebody, yes. Jaimee Allay [assumed spelling] was not able to make it. She was ill and couldn't make this session. So we miss her greatly, but we believe she's on the Twitter tweeting things, including a link to a resource sheet after this session. So please look at that as we go. Thinking about technology: we're going to have a very fast-paced conversation up here even though we only have 45 minutes. We decided we wanted to cover everything. Is that okay? Everyone like everything about technology? Great. Okay. So it's going to go kind of fast. One thing to remember, though, is that even as we talk about granular tech changes down in the field, we have to ask whether each one is a fad or a trend, right? That's something that we're always trying to sort out.
One of the things that Jaimee brings to the panel that I want to talk about is the discipline of strategic foresight. That's the idea of looking at how many factors join together to help us make some predictions, or make some good choices, as we go forward. So we don't look at technology in a silo; we think about all these things together and think about what might result -- social, technological, economic, environmental, and political. That's an acronym, STEEP, that's very handy when you're thinking about the big picture. The other thing to remember is there's lots of churn within the technology community. And one of the bad things in public libraries specifically -- not everywhere, and certainly not at this conference -- but in public libraries a lot of times we look at that churn in tech and we think, oh, gosh, we're losing something as things change. The rest of the world actually sees that churn and goes, ah, we now have new opportunities. So that's what we want to encourage you to do: when you see churn and things changing, remember that also means you've got a brand new opportunity to pursue. We have two themes today, and we will have time for questions and interaction at the end. One is around data. We all love data, right? Data's big. It is huge. Data is important, and data's scary. The other is looking at cultural heritage institutions in particular, and some of the tech trends affecting them. So we're going to start right away with data. As we all know, we've worked in institutions that are data creators. We've been creating data probably our entire careers, right? Well, now everyone is creating data in our world. If you have a device on your wrist measuring your biometrics and your movements around the world, that's one sort of thing. There's that concept of the quantified self.
But we also have the so-called Internet of Things: lots of little things in our environment, some of them with sensors, collecting lots of data and creating data pods. This is happening more and more. And this is affecting us, because within a lot of this data collection we have expectations of two things that are in conflict, which we're going to discuss. One is an expectation of privacy about our data -- having it not be exposed or available to other folks. The other is an expectation of high service that comes from analyzing that data and understanding patterns in our lives: where we're going and, to a large degree, what we're doing. That frames part of the nuance of the issue. We're going to dive into that, because we have both things happening -- an expectation of privacy and an expectation of higher levels of service using data to serve customers. So let's talk a little bit more about data mining, John. >> John Resig: Yeah. So one of the areas that I'm really interested in, in general, is ways in which we can use computers to automatically analyze large amounts of data -- the sorts of things that would be very, very challenging for us mere mortals, I guess. I personally do a lot of work with different machine-learning algorithms, and what I find interesting is finding ways in which these computers can make determinations about things in aggregate -- looking at hundreds of thousands, if not millions, of records -- that would take a human an exceedingly long amount of time.
And I think the point you were touching on before about tech trends -- another thing I would put in there would be a sort of pragmatism. One of the things that's tricky is that machine learning algorithms are extremely technically complex. You usually have to have someone with at least a degree in computer science who's capable of working on these sorts of things, and it's not the sort of thing that you can usually just drop in and have things magically get sorted out. It requires very technical staff and a lot of training. And one of the problems that's actually really hard is determining: when do you need a magical computer system to solve all your problems? When do you just need one single domain expert to go in and figure things out? And when do you maybe just need a lot of crowdsourcing to happen? These are all different steps, and sometimes it's very hard as an organization to sort that out, because I feel like people like to make the fast leap to the most technologically cool thing they can. So it's like, oh, we can just use this amazing algorithm and everything will be amazing. But then there's, well, what if we just hire this person over there? They know this stuff really well, and we don't need them to develop a brand new algorithm to do it. So, yeah, this is one thing I'm excited about, but I'm willing to hedge my bets and say we need to consider this very, very carefully. >> Carson Block: Absolutely. It's the right tool for the right job, and I think raising that question is really important. And sometimes raising the question of whether we should do it at all. >> Alison Macrina: I was just about to ask that.
I mean, with a lot of these -- you just said something about how it's sort of this exciting new frontier. And I think when we are presented with those things, especially in libraries, we want to provide a benefit to our patrons, and maybe sometimes we don't always consider what the possible implications for exploitation would be. You know, with data at that size, I think about a few different things. The first thing is: who owns it? Because most of the time when we're talking about big datasets like that, we're talking about negotiating with giant private companies, sometimes the most powerful in the world. Now, is that who we want, especially as public servants, you know? Is that good stewardship of data? I think that it's not. At the same time, who else is going to be able to manage it? Who else can provide us with that level of information? I also think about whether we're considering what kind of identifiable information is in that data, how we got it, and how it's being shared. Who has access to it? How is it stored? Is it secure? Because when you have information at that level and you connect it to a network, the avenues for exploitation are suddenly massive. It's not just people possibly hacking it, but, you know, law enforcement requests. If you have information, then it can be subpoenaed. I've seen some really incredible projects with open data -- for example, city data. I live in Massachusetts, and the Massachusetts ACLU has been working on this project called Technology for Justice. A really amazing thing where they've taken open city data and they're mapping policing in Boston and showing -- not surprising to activists -- how it maps along racially divided lines, and it also maps along access to social services and things like even train routes and stuff like that. And you begin to see white neighborhoods where they don't even let people of color in, because the stop ratio is just right outside.
That kind of stuff is incredible. But what are we getting within that? Are we getting the names and addresses of all those people of color who have been stopped by police? Is that information in the public benefit? So I want to make sure that we're asking all these questions as we're going into this new exciting frontier. >> Carson Block: And how many people are actually asking those questions as we're -- >> Alison Macrina: Not that many. >> Carson Block: -- going forward [laughter], right? And I think that's something that we would like to leave you with: this idea that it's exciting and it's scary. >> Alison Macrina: Mm-hmm. >> Carson Block: It's actually up to us to ask these questions and -- >> Alison Macrina: Totally. >> Carson Block: -- to understand the chain that all these things operate in, because you were just talking about many different pieces of the technical infrastructure, the physical and the digital -- and to most people it's just a cloud. >> Alison Macrina: Yeah, right [laughter]. I love the cloud as one of the most effective marketing gimmicks that exists, because there is no cloud, right? It's not beautiful floating zeros and ones up in the sky, you know? It's a server, a physical server that lives somewhere. You have to start thinking about the cloud as an actual physical place: a server somewhere that somebody maintains. I think there's too much trust that we put in these ideas of the cloud, and we need to be more skeptical. >> Carson Block: Indeed. What do you think as well in terms of -- what are practical ways that folks can go out of this room and go, I think I want to check on dot, dot, dot? Because I hope you're all thinking that right now. >> John Resig: Yeah.
I mean, yeah, one of the tricky things is that you kind of need to ask the question: who is benefitting from access to this information? When I think of a successful project that uses lots of data in aggregate versus one that's a lot more borderline -- in every case I can think of that's successful, it's -- well, there's a phrase in comedy: are you punching up or are you punching down? >> Alison Macrina: Right, totally. >> John Resig: Yeah, like are you impacting people who are running the government, or the police, or the institutions that are surrounding us? Or are we affecting people who we shouldn't be -- like there's no reason to be putting their addresses online or something like that. >> Alison Macrina: Right. Public interest. >> John Resig: Yes. >> Alison Macrina: Yeah. >> John Resig: Absolutely. So, yeah, I feel like that's a really big aspect of it. And I think being very deliberate in realizing who's going to be impacted. >> Alison Macrina: And I think we care about these things, but we don't even necessarily know the right questions to ask. I mean, a lot of this is above our technical paygrade. For many of us working in libraries, even those of us who are fairly technical, there's no possible way -- the whole thing is too big to know, you know. So even being able to go in and say, look, I want to make sure that we do the most ethical things that we can here -- how do you start that? >> John Resig: Mm-hmm. >> Carson Block: How do you start it? >> Alison Macrina: Oh, man [laughter]. Well, I mean -- >> Carson Block: Because you're very good at starting that conversation. >> Alison Macrina: Well, thank you. I mean, a few things that I mentioned already -- you know, think about data as having a physical form, effectively.
I think that is very helpful, because we tend to think about the cloud as this nebulous sort of thing. It actually has a server location, so who owns that server, you know? And what are they doing with it? What's their interest in it? I think the biggest cloud provider would probably be Google, because of the sheer number of services that we rely on from Google all the time. And Google is one of the world's most powerful companies. They are an advertising company. They collect that information to generate advertising revenue from it, you know? I know, shocker. >> Carson Block: No. >> Alison Macrina: Right. >> Carson Block: It is a shocker. >> Alison Macrina: Ninety-something percent of Google's revenue is from ads. >> Carson Block: But a lot of people don't think about that. >> Alison Macrina: They don't think about it, right. And they don't think about Google having actual servers where Google's engineers have actual access to it. Google does a great job at server security, you know. That is one thing that they've really prioritized, especially after Snowden. So that's the second thing: not just who owns it and who accesses it, but how is it stored? Because somebody could exploit that physical server, and then you have a whole other set of problems. And then, within the dataset itself, how are you dealing with personally identifying information? What constitutes personally identifying information? Is there such a thing as anonymized data in aggregate? No, there is not, it turns out [laughter]. It takes -- >> Carson Block: The devil is in the details. >> Alison Macrina: -- three data points to identify any human. We're all special snowflakes. So those are some of the first questions I would ask. >> Carson Block: Very good. >> Alison Macrina: I don't know. What other questions?
>> John Resig: Yeah. I mean, those are a couple points that are really fantastic. When you're using a service for free you have to ask yourself, well, where's the money actually coming from? >> Alison Macrina: Versus -- right, totally. >> John Resig: Right. An example I'll bring up here: I really like Flickr, for example. And I know a lot of institutions have used it to upload their images, provide nice little annotations, community engagement. But that's a great example where, if you're using it for free, at the end of the day, who's the product? >> Alison Macrina: You. >> John Resig: Yeah, exactly [laughter]. The content you're putting up there is the product. >> Alison Macrina: Right. >> John Resig: And it's interesting because just the other day Yahoo very quietly announced that Flickr is now a legacy service. Read into that what you will, but the way I would read into it is: I'm getting my stuff off of there [laughter]. But this is another issue: I'm never going to put any data on a service for free -- or actually for pay, for that matter -- >> Alison Macrina: Right. >> John Resig: -- that I don't have a back-up for. >> Alison Macrina: Sure. But this is a great point that the Internet Archive has actually been doing a project around, you know, perpetual cloud storage. And that was exactly the angle that they took. They were like, think of all these services that you relied on that now no longer exist -- and, ergo, all of your pictures and all of your memories. And so, yeah, not just who owns it but what happens when they fail? There's the legacy part of it -- maybe it just disappears. And then the other part of it is that data is an asset. If a company goes bankrupt, your data can get sold when they get bought out, you know.
So these are all different things. It's thinking about this ownership question: what happens when that little startup is gone or bought by somebody bigger? >> John Resig: Mm-hmm. Yeah. I think one of the things that's extremely challenging here, though, is that the nice part about free services is that they're free. And when you don't have an in-house technical staff, and when you don't have the ability to run your own servers and do all these sorts of things, free sounds really nice. And so this is the thing I feel like I don't have a good answer for, because it's like, well, okay, just stop, don't rely upon this free thing -- but on the other side, you kind of have to have all these other things. >> Alison Macrina: But there are different kinds of free, right? I think it's totally legit -- we have to think about what kind of exchange is happening when we get something for free. It's our data, you know, that we're volunteering. I work a lot with free and open source software, and to me that is a good -- it's not an ideal solution, but at least the reason why you get the thing for free is that it is a community effort. All these different people all over the world are working on it, usually as volunteers, and that is a thing that I like to put my trust in a little more than a private company who maybe makes something more sophisticated and nicer. I think about the difference between Google Maps and OpenStreetMap, for example. OpenStreetMap is a free and open source project that doesn't work as well. Google works really well because they've got the little cars that drive around and take pictures, and they're monitoring you when you're in traffic, and all this stuff. But I think the first thing is people have to know what kind of exchange they're making.
>> Carson Block: But I think I have a magic bullet answer that no one's going to believe. >> Alison Macrina: Go on. >> Carson Block: We need to invest more in our technology people in our cultural heritage institutions and our libraries. What I've been seeing constantly, especially at the leadership level, is that they'll confuse free with free, right? >> Alison Macrina: Mm-hmm. >> Carson Block: Free open source software is free as in freedom -- for you to pick up the ball and do something with it, and it's [inaudible] community. >> Alison Macrina: It takes time and -- mm-hmm. >> Carson Block: And that investment actually has been low. I would say that because of the tipping point that we're at -- and what we're talking about is just the tip of the iceberg in these issues -- we need more knowledgeable workers on the IT side, on the technology side, who understand how the technology impacts not just the services we're giving people, not just what they get, but also how we're doing it. Are we adhering to our own culture of privacy or confidentiality? Are we doing the details and due diligence on that? Right now we're not investing in that in any way, shape, or form. And I also don't think we're cultivating the right sort of tech leaders. We've got lots of siloed folks who are passionate and leading the way in these little areas, but not joined enough. And we also have some people who are better at marketing themselves as wow, gee whiz, instead of really looking at the people we serve and the ideals that we're trying to serve in our communities, and carrying those forward. So I'll get off my soapbox on that. But that's a -- >> Alison Macrina: That's a very good soapbox [laughter]. Hire them and pay them a lot more than what we're paying them. >> Carson Block: Yes, yes. >> Alison Macrina: That is dreadful. >> Do your best, everybody. >> Alison Macrina: Really. We should be embarrassed by that.
>> Carson Block: Yeah, yeah. That's what I'm saying [laughter]. Very nice [laughter]. But don't feel bad, okay, because we all know what the reality is -- we do up here. Let's move into some specific things about cultural heritage data. We've got lots of content. We have lots of awesome, awesome content. I can't believe some of the great programs here; I could not decide what to go to next because of the awesome content. We also have this tension between aggregation -- with the hooks into it, so that we can have great content in one pot and share it in so many places -- and the hyper-local stuff, because as we know, some of this content that we're curating and making available is most important back home, right? So that's an interesting tension. We don't want to leave it in a silo, but it's really used there. The other thing that occurs to me in terms of our uniqueness is that usually we use technology, hopefully, to get an economy of scale. That doesn't happen very often in libraries because of the special snowflake problem. Great libraries are hyper-local, so that's our initial orientation, and it's very hard to get technology that scales nationally and internationally in the same way. That's really a problem for cultural heritage institutions. Not a problem, but a challenge, because even our processing is really based on material type, right? I had one job where I was supposed to go in and find more efficiencies between the different types of material-type processing, and not only could I not find cross-efficiencies that would work, I couldn't find any evidence that anyone else had found them either. When you look at all the standards, they're highly specialized to their material type, right? So a document, for instance, has a different set of standards, requirements, and nuance than a photograph, right, in terms of metadata, description, sharing it out -- many things that everyone here knows about.
There's also something really interesting that doesn't get talked about nearly enough, and that's rights. Who owns the stuff? Alison. >> Alison Macrina: Well, it's certainly not us, and it should be [laughter]. I mean, yeah -- what's a longer answer than not us? Yeah. I -- >> Carson Block: Honest is good [laughter]. >> Alison Macrina: Yeah. That's an exchange that we've been making over and over again. Thinking again about our inability to provide our own services, or to create the kind of tools that we need to engage our patrons with digital collections and stuff -- we've moved to entirely third-party models, and we've seen some of the failures of this. I mean, the Adobe thing was a scandal, as it should have been. That is just one of these ways that -- and the Adobe thing was a big problem, obviously, because of the privacy breach. But it had been a problem before that, because DRM is antithetical to library values -- the whole way that we've moved away from an ownership model into a licensing model. We should have never said yes to that, and we did. And I understand what we were thinking, but now it's very difficult to get the toothpaste back in the tube. From a rights-based standpoint, again, think about who stores and owns data and what that means for access. If I am law enforcement and I want to send, for example, a national security letter -- which is a government subpoena that comes with an attached gag order that says you have to hand over information about your patrons or your customers or whomever, and you can't tell anyone that you've got it -- libraries in 2005 received one of these. We don't have very good data about how many other libraries may have gotten them, because they have gag orders attached. It's a very scary thing for libraries to have to contend with in a post-9/11 world.
Even scarier is that if we rely on all these third-party vendors, we will never see that notice. We won't even be -- >> Carson Block: Right. >> Alison Macrina: -- able to talk to our attorneys or the ACLU or whomever, because it goes directly to Adobe or OverDrive or Elsevier or whatever. And so in some ways our charge of protecting privacy and protecting intellectual freedom -- and all the amazing things that we've done in the interest of that, where we purge records and we fight back against unlawful information requests -- it's been taken out of our hands. And that's a scary new world. >> Carson Block: It especially is, because I know that in the times when I actually had gainful employment working in a library, I was in a city library. I would get a call -- usually it would be a casual call from the police department, a department that's part of our stable, right? We are colleagues within the organization. It would be a casual call saying, so, we want to know if so-and-so was at the library at a certain time. >> Alison Macrina: Yeah. >> Carson Block: And I would say, so, you know that legally I cannot give you that information. >> Alison Macrina: They always do it so casually, too. They're like -- >> Carson Block: Yeah. >> Alison Macrina: -- we're friends. We're all in the same community. Don't worry about the warrant provision. Like, you know, we're buddies. >> Carson Block: And at least at that level they were asking, right? >> Alison Macrina: Yeah. >> Carson Block: They would ask for that. And actually, with law enforcement, we always had great conversations, because what I would say is, you know, I can't give you anything without a subpoena. However, we don't like people breaking the law. So let's have a conversation about this. What is actually prosecutable evidence in this case? >> Alison Macrina: Yeah, yeah.
>> Carson Block: And a lot of times the officers I would talk to would say, well, you know, if we can see somebody doing something, that's better than anything. And I'd say, why don't you just visit the library then, if you think someone's breaking the law? We'd love to see you. And we don't want lawbreaking to happen within our building. We don't want people to get hurt, you know. So that's a different sort of conversation than, hey, we just got the honeypot from Google. >> Alison Macrina: Right, yeah. >> Carson Block: We don't even need to talk to you about this. >> Alison Macrina: Don't even need to talk to you, yeah. Totally. Yeah. >> John Resig: Yeah. There's also another tricky issue in here, which is the rights over the material itself held by the people who originally created it, or were involved in its creation. There was this blog post I read recently about a library that had digitized a lot of really old zines and stuff, from the pre-Internet era. >> Carson Block: How nice. >> John Resig: And when they were created, there was never an expectation that they would be digitized and distributed all over the world. They were inherently -- there were going to be just these couple dozen copies, and they would go to friends. And so the problem then becomes that some of these contained material that mentioned very specific people, and if it were connected back to those people, who were still very much alive, it would harm their lives. >> Carson Block: Sure, yeah. >> John Resig: So this kind of ties into the general right to be forgotten: do those people have the right to then request, through the library or whatever institution had digitized it, that this material be redacted or removed? And this is something that I'm still mentally trying to grapple with. I'm not sure I completely understand it or can appreciate it.
And I think one of the things that's challenging -- but one of the points about this I think I did understand -- is that libraries need to have very clear policies around this: if there is material that we have digitized and made accessible that does involve you, do you have the ability to request that it be removed? >> Alison Macrina: Yeah. >> John Resig: And if so, what are the ways in which that can happen? >> Alison Macrina: It goes back to the point that you were making earlier about what the power differential there is, you know. Because I believe very strongly in transparency for powerful people and entities like governments and corporations, but privacy for individuals. And the right to be forgotten is so tricky, because you don't want to open up the opportunity for powerful entities to be able to say, well, now I want to censor this stuff. But also some rights are in conflict. I see the right to be forgotten as a kind of extension of the Fifth Amendment, because I might want to remove information about myself because I might not want to self-incriminate, or just because that information should belong to me in some way. But it's not an easy thing to figure out. And then at an institutional level, how do you write a policy for that? You know, how do you even -- >> Carson Block: How do you understand it? >> Alison Macrina: Yeah. >> Carson Block: Right? >> Alison Macrina: Totally. >> Carson Block: And on the other side of things we've got this content that is great for mixing and mashing up. And I just experienced that. It was so cool. I'm working with Boulder Public Library, and we had a focus group for part of the Carnegie Library for Local History. And this guy calls me up later. He says, "I want you to see this website I created called hereminus100." What he did is he took all these different mashups of vintage historical photographs.
For instance, he had a lot of mashups, but this one was really kind of funny. He had a slider where he took his own modern photograph from the same vantage point, and it was so incredibly cool. He didn't ask anybody if this was okay. He just said, "Ah, what do you think? Do you think they'll get mad?" That was his question to me. He goes, "Do you think they'll get mad?" And I said, "Well, let's look at the rights -- >> Alison Macrina: Yeah. >> Carson Block: -- "on the photograph, number one, because that's the bottom line. Mad's really got nothing to do with it." >> Alison Macrina: Right. >> Carson Block: "Let's see if you have the ability to use this. And I think it's awesome" -- >> Alison Macrina: Mm-hmm [laughter]. >> Carson Block: -- "at the same time." But I keep thinking that that's one type. We're also creating all sorts of new digital content that I think will be ported into all sorts of things, like video game characters or situations or motifs, for instance. Are you seeing anything that you think is awesome when it comes to the sorts of things that are being collected by cultural heritage institutions now, thinking it could be used as dot, dot, dot? Does anything there pique your interest? >> Alison Macrina: Hmm. That's a good question. >> Carson Block: Well, we'll keep looking around -- >> Alison Macrina: Yeah [laughter]. >> Carson Block: -- while we're here. That was not on the script. >> Alison Macrina: I mean, it's making me think about copyright issues more broadly, I think. It made my brain go in a different direction -- thinking about not just the way that our public is consuming information and creating new things with it, and the implications of that for copyright, but also new copyright issues, especially with trade agreements and all that sort of stuff that would create really draconian new rules around DRM and other licensing and stuff.
And then also thinking about, again, you know, in terms of what the public is kind of interested in and what they might want from us as cultural institutions. Look at the popularity of Sci-Hub and LibGen, which are basically, you know, pirated academic papers for free online -- as they should be, totally free. It's totally illegal; Elsevier is suing them. But it's in the Aaron Swartz, direct-action kind of vein of: let's put this all online, because people need it and people are benefitting from it, and they love it. And everybody -- I mean, those of us in academic institutions, or anybody who has to deal with any kind of paywall or per-use subscription, understands why these things are so popular. So what does that mean for us in terms of, you know, the rights of the copyright holder versus what we need to provide as a public service? I don't really have an answer. I'm just sort of thinking about it. >> John Resig: I think one of the things that's interesting is that there are two aspects of this. One is: legally, who has the right to do activities with a certain thing? And that can usually be defined with the help of a lawyer. And I will call out the work that the New York Public Library did. There's a discussion this afternoon on it as well, if I remember [inaudible]. This afternoon or next afternoon. I don't remember. But -- this afternoon, okay. The work that they did to methodically go through a lot of their materials, digitize materials, and figure out what was or was not in the public domain -- that work is so, so hard, and so few institutions actually do it because, one, it's time consuming, and that time is usually lawyer time, which is very expensive.
And so the thing is that a lot of institutions don't do this, and instead they hedge their bets with a lot of very vague language where they say, like, oh, this is public domain for academic use. And you're like, what is that even -- that's not a thing that exists, all right? Either it's public domain or it's not public domain. >> Carson Block: Like a license. >> Alison Macrina: Yeah [laughter], right. >> John Resig: But there's a tricky part of that here, and maybe this is sort of a thread of what we're going through: having the right to put something online is not the same -- like, you could be legally totally in the clear. However, you have to think about who's going to get upset. Who could this impact? Upsetting Elsevier is a completely different thing from upsetting, you know, someone whose personal information you put online -- >> Alison Macrina: Totally. >> John Resig: -- because it was published back in, like, 1970 in a zine or something. These are things you kind of have to mentally separate. You have to think about who is being hurt by this, or could be hurt, or could be upset. Yeah. >> Carson Block: Absolutely. Well, we have about 10 more minutes left. >> Alison Macrina: Oh, wow. That went by so fast. >> Carson Block: It just goes by so quick. And we know that we covered a lot of stuff. We kind of wanted to have a smorgasbord of different topics that we thought were interesting in tech trends in libraries and cultural heritage institutions. But we would love this to have your voice in it, too. Are there any comments that you have? Any questions for us? I think we've got a roving microphone around here somewhere. We'd love to hear your comments. We'd especially love to hear things that you're struggling with. Any questions or comments?
^M00:33:07 ^M00:33:10 Lunch was good. Oh, yes. We've got a gentleman here. Let's see. I think they're coming over to you with the microphone. >> Okay. >> John Resig: Oh, down the other side. >> Carson Block: Oh, here we go. Take your choice. You can be stereo [laughter]. Speak into both of them, I think. >> Alison Macrina: Use them both. >> Okay. Two things I didn't think I heard that I'd like to get your opinions on. One of them is RDF and linked data. And the other's deep learning. >> Alison Macrina: Wait. You -- >> Carson Block: I'm sorry. Could you repeat -- >> Alison Macrina: -- speak into the microphone. I -- >> Carson Block: Yeah. Hold on. We'll -- >> Hold it up. Sorry, sorry. >> Carson Block: Thank you. >> So two technologies I didn't hear about which I'd like to get your take on -- RDF and linked data, and deep learning. The relevance of both of those, or not, to the library. >> John Resig: Okay. Yeah. I guess I have opinions on these. So I've been doing work with RDF and linked data. The promise of this technology is that if everyone has their information in a consistent format, talked about in the same way -- when you say this is a thing called a book, and it was written by a thing called an author, and that author is described by these names, et cetera -- then you will be able to show, for example, all the books that were written by a particular author, or published by a certain publisher, et cetera. And all of this extends beyond books. You can do this for art; you can do this for all sorts of things. One of the things that's really tricky, though, is that it's extremely idealistic: if you look at the absolute best-case scenario, where all the data is perfect and everyone has put a lot of time and effort into making it perfect, that best-case scenario is amazing. You can do so much stuff and explore so many relationships.
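The linked-data model John describes -- every fact as a subject-predicate-object statement -- can be sketched in a few lines. This is a toy illustration only, with invented identifiers, not any real catalog's data or RDF tooling:

```python
# Toy triple store illustrating the linked-data idea:
# every fact is a (subject, predicate, object) statement.
triples = [
    ("book:moby_dick", "type", "Book"),
    ("book:moby_dick", "writtenBy", "author:melville"),
    ("book:billy_budd", "type", "Book"),
    ("book:billy_budd", "writtenBy", "author:melville"),
    ("author:melville", "name", "Herman Melville"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the fields that are not None."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# "All the books written by a particular author":
books = [s for (s, _, _) in query(predicate="writtenBy", obj="author:melville")]
print(books)  # ['book:moby_dick', 'book:billy_budd']
```

The value only appears when every record uses the same predicates and identifiers consistently, which is exactly the idealism John flags next.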
However, the reality is that many institutions do not have the ability, either technical or logistical, to make it perfect. So what you end up having is a lot of institutions somewhere down here, just trying to get basic information into their records, and that promise is not fully realized. So I think this is going to be a major struggle. Big institutions like the British Museum have a ton of really good linked data -- because they're the British Museum -- whereas a small library is not going to have that ability. They don't have the staff to make it happen. So that's one where I feel like you have to take it with a grain of salt. Yes, in a perfect world it could be amazing. But think about your specific use case: what are the questions we're trying to ask? How can we answer them? And how can we do that in that limited little area? I guess the other question you had was about deep learning. So, specifically, deep learning is a type of machine learning, and it's been popularized recently via Google and other organizations doing deep learning for things like detecting objects in images, for their self-driving cars, and stuff like that. I've been experimenting with different deep learning algorithms to find and annotate images to improve their searchability, for example. And this is interesting. One of the tricky things is that you have to be aware of how comfortable you are with wrong answers, okay? Because the thing is that even really good deep learning algorithms will only sometimes be right. Let's say you have an image with a dog in it. Most algorithms will be right, somewhere in their top five guesses, maybe 60% of the time, okay? Just as an example.
Whereas that's, like, the -- okay -- >> Carson Block: Sixty percent. No, I was just -- I was in that top five -- >> John Resig: Yeah, yeah. So it will be like: cat, chicken, dog. And it's like, okay, well, you got it mostly right. You know, there's a dog in there [laughter]. >> Carson Block: I think we went to a fortune teller. They had the same success rate. >> John Resig: So, for example, I used this recently for a dataset of artworks created by artists, okay? But the problem is that you need a lot of training data -- okay, here are a ton of cats and here are a ton of dogs -- but in the case of paintings, you don't have that. You only have a very small subset of data. And I think the problem is that, yeah, the theoretical future, again, is going to be very cool. In the meantime, you have to be very willing to have wrong answers be prevalent. And you saw this just recently -- to bring up Flickr again, of all things. They did this for their images. Whenever you upload an image, they automatically annotate it with tags. And then you start having really bad things happen with automatic annotation. One of the things was that it started to annotate people of color with tags like "monkey" and stuff like that, which is terrible, terrible. It should not be happening. But that's because how they trained the algorithm was completely misguided. >> Alison Macrina: Right. The engineers were like, our algorithm is neutral -- we showed it only white faces, because that's what we think of as people. And then, of course, you have these outcomes that are awful and offensive and inaccurate. And people are like, no, but algorithms are all neutral. But people are not, you know. Math might be neutral, but people -- they screwed up. >> John Resig: So, anyway [laughter].
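The "right somewhere in its top five guesses" measure John cites is what's usually reported as top-5 accuracy. A minimal sketch of computing it over a labeled image set (all data here is invented for illustration):

```python
def top5_accuracy(predictions, labels):
    """predictions: list of ranked guess lists (best guess first);
    labels: the true label for each image.
    An image counts as correct if its true label appears
    anywhere in the top five guesses."""
    hits = sum(
        1 for guesses, truth in zip(predictions, labels)
        if truth in guesses[:5]
    )
    return hits / len(labels)

# Invented example: three images, ranked guesses per image.
preds = [
    ["cat", "chicken", "dog", "fox", "wolf"],   # dog is in the top 5: hit
    ["car", "truck", "bus", "train", "plane"],  # dog missing: miss
    ["dog", "cat", "bird", "fish", "horse"],    # dog ranked first: hit
]
truth = ["dog", "dog", "dog"]
print(top5_accuracy(preds, truth))  # 2 of 3, about 0.67
```

Note the metric's leniency: the first image counts as a "hit" even though the model's top two guesses were wrong, which is exactly the cat-chicken-dog situation John jokes about.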
As long as we're saying that, I feel like there's still a lot of hurdles to overcome. >> Alison Macrina: Yeah. >> John Resig: And -- >> Alison Macrina: Well, look at how fast the Microsoft AI went Nazi. >> Carson Block: Oh, yeah. >> Alison Macrina: She went full Nazi in 24 hours -- >> Carson Block: Bam, bam. >> Alison Macrina: I mean, if you can't design your bot to be 4chan-resistant in 2016 -- ^M00:39:23 [ Laughter ] ^M00:39:25 -- what the hell are you doing? They were just like, repeat after me. And she did. I mean, of course, you know. So one thing I'm very interested in is a burgeoning movement of ethics in AI and machine learning and bots and all that fun stuff, you know? Like, how do you train a machine to be cool? >> Carson Block: Yeah. How do we raise our digital children? >> Alison Macrina: Yeah, exactly, exactly, exactly. >> Carson Block: Yeah, yeah. And I love the standards question, because standards, I think, at best are only referenced by the real world, right? In a closed system we can maintain them. And I'm saying this because I work on a lot of building projects. And let me tell you something about how standards and specifications, which are hard requirements in a building, are just kind of referenced sometimes in the old construction process. So that is a super challenge there, especially with the churn in technology, right? Things are changing all the time. We're realizing we haven't covered everything, and how do we cover it? Do we have any other questions? Is food digesting well after lunch? >> John Resig: I know this is the [inaudible]. >> Alison Macrina: There's one more. >> Carson Block: Excellent. Oh, good. Thank you. Thanks. >> I would just like to hear more about what trends you're excited about, or any projects you've heard of that are really interesting. >> Carson Block: Hmph.
>> Alison Macrina: One trend that I'm really excited about is something that, generally, I call secure defaults. The Internet, generally speaking, is a very hostile place. It is not secure or private by default, almost ever. The whole thing is written on postcards. That means exactly what you think it means: anyone can see the metadata, and the content of what you're doing, depending on the kind of connections you're making. So what's been happening now is a push to make things secure by default. And one project that I'm so excited about, especially for the implications it has for libraries and cultural heritage institutions, is something called Let's Encrypt. Let's Encrypt is a project being worked on by the Electronic Frontier Foundation -- if you don't know them, they're sort of like the ACLU, but for the Internet -- along with Mozilla, Akamai, and a few other really big Internet organizations, you know, public and private. And basically what Let's Encrypt is doing is making it really easy for anyone who runs a webserver, that is to say a website, to set up strong HTTPS encryption on the whole site, by default, forever -- with free, automated renewals. This is a really important thing for libraries, especially given our charge for stewardship of our patrons' privacy. If our library websites are not encrypted -- if they use just regular HTTP to connect -- that means that when your patrons are going on your site and looking up books about divorce or diabetes or gender identity or any other sensitive topic, that is going out over the wire, up to the rest of the Internet, in plain text. Anyone can see it. They can see the originating location. They can see that it's coming from your library. And if you've only got an encrypted login page -- if you encrypt just the page where they enter their library card -- that is actually worthless.
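Alison's "postcards" point -- what an on-path observer can read -- can be illustrated with a deliberately simplified sketch of my own (real traffic analysis is more involved): with plain HTTP the full request, path and query included, is readable; with HTTPS, roughly only the hostname leaks (via SNI and DNS):

```python
from urllib.parse import urlparse

def visible_to_eavesdropper(url):
    """Toy model of what a network observer can read from one request."""
    parts = urlparse(url)
    if parts.scheme == "http":
        # Plain text: the whole request is readable on the wire.
        return {"host": parts.hostname, "path": parts.path, "query": parts.query}
    elif parts.scheme == "https":
        # Encrypted: roughly only the hostname is exposed (SNI/DNS).
        return {"host": parts.hostname}
    raise ValueError("unknown scheme")

print(visible_to_eavesdropper("http://library.example/search?q=divorce"))
# {'host': 'library.example', 'path': '/search', 'query': 'q=divorce'}
print(visible_to_eavesdropper("https://library.example/search?q=divorce"))
# {'host': 'library.example'}
```

The hostname `library.example` is made up; the point is that in the HTTP case the sensitive search term itself travels in the clear, which is exactly the catalog-search scenario Alison describes.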
And even worse is that it can create a cookie that can then be intercepted by someone, so they can steal that login information and also still see everything that's being done after the secure login. So Let's Encrypt is a project to make it really simple for all library websites to use encryption. I'm really happy that DPLA has prioritized this, actually. Mark has been working on this. They're not using Let's Encrypt, but they're about to fully deploy HTTPS for their site. They ran into some mixed-content issues and other things, but they ended up fixing it, and it took them just about a week, I think, to get it all together. If your library's interested in Let's Encrypt and you don't really know what you're doing, you can talk to the Library Freedom Project. We have a privacy pledge that we've gotten a bunch of people to sign onto. So that's probably the biggest thing, and sort of the most important -- something that any library should do. And I'm encouraged to see how many have done it so far. >> John Resig: Very tangentially -- and I'll second Let's Encrypt; I've been using it myself and I think it's fantastic -- >> Alison Macrina: Awesome. >> John Resig: -- one of the technologies that I'm most excited about is computer vision. I've been using different computer vision algorithms, and there's one particular type, the kind you see in Google Image Search or, for people familiar with it, TinEye, where you can upload an image and it will find images that look like that image. I feel like this makes for a fantastic finding aid, because when it comes to images, if you don't have amazing metadata to go with them, they can be hard to locate. However, computers are really good at seeing that two images are very, very similar to each other. So there are different commercial algorithms out there.
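A minimal flavor of this kind of image similarity -- comparing compact perceptual fingerprints instead of metadata -- can be sketched with a toy difference hash over raw grayscale grids. This is my own simplified illustration of the general idea, not the algorithm Pastec or TinEye actually uses, and the pixel grids are invented:

```python
def dhash(pixels):
    """Difference hash of a grayscale grid (list of rows of ints):
    for each pixel, record whether it is brighter than its right
    neighbor. Similar images yield similar bit strings."""
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits

def hamming(a, b):
    """Count differing bits; a small distance means similar images."""
    return sum(x != y for x, y in zip(a, b))

# Invented 4x4 grayscale grids: img_b is img_a slightly brightened
# (same structure), img_c is unrelated noise.
img_a = [[10, 20, 30, 40], [40, 30, 20, 10], [10, 20, 30, 40], [40, 30, 20, 10]]
img_b = [[12, 22, 32, 42], [42, 32, 22, 12], [12, 22, 32, 42], [42, 32, 22, 12]]
img_c = [[90, 10, 80, 20], [15, 95, 25, 85], [70, 5, 60, 50], [30, 99, 45, 11]]

h_a, h_b, h_c = dhash(img_a), dhash(img_b), dhash(img_c)
print(hamming(h_a, h_b))  # 0: same brightness gradients, so "similar"
print(hamming(h_a, h_c))  # 7 of 12 bits differ: different structure
```

Because the fingerprint only encodes relative brightness, the uniformly brightened copy hashes identically, which is why this family of techniques works as a finding aid even when an uploaded photo doesn't match the stored image pixel for pixel.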
However, there's one open source one I've been using just recently called Pastec, P-A-S-T-E-C. It's written by a French developer, and I and some other people in the digital humanities world have been contributing to this particular algorithm because, one, it's the only open source one that we know of, and, two, it seems to actually be pretty good. For example, I've been using it for some of my projects to help scholars locate artworks. And the Brooklyn Museum, just last week, took this and implemented a piece of software so that people, when they're going around the museum, can photograph artworks and it will find all the information about the artwork, just by looking at the photo that they took. So it's really easy to implement. This is one thing I feel can have an immediate, direct impact on people who are trying to find information. And I think that's pretty much one of the definitions of what a library is and what it provides. So I'm very excited about that. >> Carson Block: That's excellent. We're almost out -- we are over time, but I'll just mention the thing that I'm most excited about, because I'm a musician, and that is zero-latency, real-time collaboration technology, which I forgot the name of because I just learned about it last week. But I'll post it online. This is really important for musicians who are geographically dispersed, to be able to play together in real time. There's this thing that happens for musicians that's really weird: we hear each other and we listen to each other, but when things are really tight, it's because we're anticipating what the other person is going to do. And that's what makes an awesome, awesome musical performance. So I'm extremely excited by that. Now, you need a gig connection for your session -- not for your library, but for your session.
But I just got involved with that group and I'm psyched about that, because it fulfills the promise of actual interaction using these wires and radio waves. >> Alison Macrina: That's really all they're good for [laughter]. >> Carson Block: That's right, exactly. Amen, sister. Thank you very much for your time today. Thanks for hanging out with us today. Have a wonderful rest of the conference. ^M00:46:33 [ Applause ] ^M00:46:35 >> Alison Macrina: Thanks, guys. ^M00:46:37 [ Inaudible Conversations ] ^M00:46:43 >> This has been a presentation of the Library of Congress. Visit us at loc.gov. ^E00:46:53