Video: The Modern Data Dilemma: Managing Mobile, Ephemeral, and Cloud-Based Evidence | Duration: 3904s | Summary: The Modern Data Dilemma: Managing Mobile, Ephemeral, and Cloud-Based Evidence | Chapters: Introducing Modern Data Challenges (38.085003s), Mobile Discovery Challenges (216.585s), Mobile Forensic Challenges (344.47s), Mobile Data Review (750.68s), Mobile Data Review (1020.94495s), Ephemeral Messaging Challenges (1374.41s), Ephemeral Messaging Challenges (1475.585s), Handling Ephemeral Messages (1818.885s), Hyperlink File Challenges (2299.015s), Linked File Strategies (3395.605s), Linked Document Management (3689.24s)
Transcript for "The Modern Data Dilemma: Managing Mobile, Ephemeral, and Cloud-Based Evidence":
Everybody, welcome to the Everlaw webinar on the Modern Data Dilemma: Managing Mobile, Ephemeral, and Cloud-Based Evidence. Over the next hour, we'll discuss the challenges for legal and eDiscovery professionals associated with today's modern forms of data and how to apply best practices and technology to address those challenges. Before we get started, let's introduce our panelists for today's webinar. Steve Davis is a licensed private investigator and vice president of forensics and investigations for Purpose Legal with a distinguished career working in the legal service and digital forensic industries. His strong investigative background includes managing hundreds of civil and criminal investigations on a national and international scale. He focuses on investigation management, computer forensics, and litigation support. Welcome, Steve. Thank you, Doug. Nathan Sawatzky is a senior solutions architect at Everlaw. He spent the last seven years at Everlaw consulting with clients about eDiscovery workflows, functionality, and how best to maximize the value they get from their tools. Welcome, Nathan. Thank you, Doug. And I'm Doug Austin, editor of eDiscovery Today, a daily blog about eDiscovery, cybersecurity, data privacy, information governance, and AI trends, best practices, and case law. I've been providing consulting and project management services to clients in the legal space for over thirty years, so I know a thing or two because I've seen a thing or two. And I've been daily blogging for nearly fifteen years now, so I'm either committed or I ought to be. Before we get started, I should remind you that any ideas expressed by us here today are our own, not those of our organizations, clients, or partners. That's true regardless of which modern data format the disclaimer is located in. So let's get started with the first of our challenging modern data formats, mobile data and short form communication.
And I'll set the stage by discussing that first question you see there. Why is mobile data becoming increasingly critical in eDiscovery? And the answer honestly seems pretty obvious. Mobile devices are everywhere these days, and many of us are using them on and off all day. And here are a couple of stats to back that up. There are 7.1 billion smartphone users in the world, which means that about 85% of the global population owns a smartphone. And on average, people in the US spend about four hours thirty-nine minutes daily on their mobile phones, and that excludes time spent on voice calls, which was the original purpose of the phones in the first place. We're not only using the devices for personal use, but they've increasingly become ubiquitous in work use as well, especially in terms of communications with work colleagues via text and other messaging apps. However, discovery of data from mobile devices can be challenging, so much so that parties, in many cases, have agreed to exclude mobile devices from the scope of discovery simply because they felt the burden of obtaining the evidence outweighed any potential benefit of that evidence. That's especially true in corporations where just 2520.5% of corporate respondents in the eDiscovery Today 2025 State of the Industry report said they have mobile device data in their cases all or most of the time. As a result, we've seen a rise in case law rulings involving spoliation of mobile device data as well as disputes over whether data from employee BYOD devices is in the possession, custody, and/or control of their employers. So let's talk about why discovery of mobile device data is so difficult. So, Steve, I'll turn to you on this one. What are the biggest technical challenges in collecting mobile device data? Yeah. And I'll piggyback on your comments, Doug, a little bit.
I think, first of all, if you go back twenty, twenty-five years as an investigator, you know, we were taught the first thing to look at was kinda corporate email. We ran in there and we grabbed the corporate email and brought it back home and took a look at it. And, obviously, people that are doing things that are nefarious or with bad intent sometimes use their corporate email. But, really, we've seen a movement away because that's oftentimes monitored by the sysadmin or by the organization or enterprise. So, really, with phones, people have this notion when they have the phone in their grubby little hands that they're able to go ahead and access and do things and search things and send messages, and there's really no visibility to it, and that's obviously false. So we've seen, first of all, a transition away from just the rote go-to of M365 or Gmail, and now we're seeing a lot more communications occurring on mobile devices and tablets and the like, primarily because of the use of apps, so third-party apps that exist there. From a technological standpoint, and kind of the first question, this is in large part due to the fact that we're not dealing with a normalized linear operating system with loose files sitting on top of it. When we do a bit-by-bit forensic image in order to collect data and then be able to mathematically defend what we've done, we're able to see what existed beforehand, do the operation of the bit-by-bit image, and then test it on the back end to ensure that mathematically it matches and that there's no interruption to the process, and then the process leads to a sound, defensible collection that we can point back to later for purposes of authenticity or spoliation. Nowadays, when we're dealing with mobile devices, we're dealing with structured data. So think SAP or Oracle or SQL databases.
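The "mathematically it matches" check described for a bit-by-bit image is commonly done by comparing cryptographic hash values of the source and the copy. The following is a minimal sketch of that idea only, not the workflow of any particular forensic tool. The function names (`sha256_of_file`, `verify_copy`) and the stand-in files are assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so a large image need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source: Path, image: Path) -> bool:
    """Return True when the copy hashes to the same value as the source."""
    return sha256_of_file(source) == sha256_of_file(image)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical stand-ins for the original media and the forensic copy.
        source = Path(workdir) / "source.bin"
        image = Path(workdir) / "image.bin"
        source.write_bytes(b"example evidence bytes" * 1000)
        shutil.copyfile(source, image)
        print("hashes match:", verify_copy(source, image))
```

A matching hash before and after the copy is what lets an examiner point back to the collection later. A parsed mobile extraction does not offer the same single before-and-after comparison, which is the contrast drawn in the discussion.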
And with the collection vehicles and forensic tools that are out there, people talk about imaging a phone, but you really don't image a phone. What you do is you extract out the data or parse it and normalize it. And so we're going into tables and partitions and databases and extracting out data that may be transient by nature, much like accounting, pushing information out. Like, accounting would be FIFO and LIFO, first in, first out, same concept. We have three daughters, my wife and I, and the way that they serially use phones, they would move data off of their devices much more rapidly than someone like me. So there's a couple of complexities involved with access to information and then normalization so it can be utilized on a platform like Everlaw. Alright. So when we're talking about apps like iMessage, WeChat, and Discord, how do forensic tools handle communications from those apps? Yeah. Great question. So my latest CLE that I'm doing, where I get to talk about this stuff a lot, is called "It Depends." And I always thought that was the worst answer in the world when I first broke into the business. I'd ask older people a question, and they'd say it depends. I'm like, that's a terrible answer. Like, tell me the real answer. Like, what's the truth? They're like, no. No. No. It depends. Unfortunately, I've become that person. So now it does depend. And you mentioned three different items, and each one's dramatically different because of a couple of concepts. One's the concept of residency, where data lives, and one is called redundancy, where data is synced or backed up to. So when the providers first started issuing phones, the data was resident on their colocation centers and their servers. Then they figured out that lawyers existed, and, you know, the lawyers would call on the providers and they'd be like, hey. Gonna need all those call records. Gonna need voice mails. Gonna need content.
And the providers were like, well, this is a terrible idea. Let's push all the data out to the endpoint. Push it to the device. So our phones went from being nominally sized into hundreds of gigs, even upwards of a terabyte, and data gets pushed out. But now you need to understand whether it's iOS with your iPhones or whether it's Android, two very distinct platforms and operating systems and literally hundreds, if not almost thousands, of different databases that can exist on phones, SQLite being one of them. We're dealing with a different amount of access and a different amount of residency depending on apps. Like, for a long time, WeChat, one of the ones you mentioned, we can get to that through Apple products because Apple developers are really good at their jobs, and they force people that sign up for being sponsored by Apple to be able to decrypt their data from the database that resides inside the SQLite area. We're able to decrypt that data and get full access, but that's not true necessarily of Android. So just a year or two ago, we had to use a product, this one called AXIOM, that went in and got access, but it only got access to the last one year of WhatsApp data, and you can just imagine how excited a lawyer would be about this information. Like, you can get it on Tuesday afternoons if it's not raining, but, you know, if it's another day of the week or it's sunny out, then we have a different condition. And that's not very comforting, and I think in large part it's the reason why maybe courts and practitioners haven't adopted it. It's some of these challenges and complexities more so than its utilization by principals or custodians in a matter. Yeah. Well, so with all of those considerations and challenges, what are some best practices for collecting and preserving mobile data defensibly that you recommend to our audience? Yeah.
I think number one is, you know, we all talk about custodial interviews, custodial questionnaires that exist. We've actually created one at Purpose Legal that is a custodian questionnaire specific to the mobile device environment, and it gets very granular about, like, how much space do you have on your phone, and can you load an app to it, and is there MDM involved, and what third-party apps do you utilize, and a whole bunch of questions that we ask, what I call crossing the bridge before you get to it. In other words, we can't afford to look back later and talk about what we should have asked or what we should have known. We've gotta ask these questions upfront. So we've gotta grab the tiger by the tail and deal with custodians who aren't very excited because many times these are personal devices, and then ask them a lot of questions that'll probe into the usage, what they do, how they save information, what apps they use, and that's gonna tell us a ton about best practices. You know, should you use Cellebrite, the 800-pound gorilla in the mobile forensics area, or do you use other products? And there's a slew of them out there that allow you to get the information. And once again, it kind of depends on how well they access information and what you're dealing with, because many times people don't want you to do an entire capture of a phone. And then with a phone, from a forensic standpoint, once again, it's not an image. Rather, we're either doing a logical collection or we're doing something called a physical collection or full file system extract, which allows us to get to the maximum amount of data that's available. That would be ideal in every case, but it can also take six, eight, ten, twelve hours to do. And so you have this push and pull with custodians and practitioners that wanna believe things, you know, I wanna get to New York in two minutes. Right? Everyone does, but you just can't.
And so you have the realities of juggling expectations, defensibility, thoroughness, and completeness. All those things are balls that are juggled up in the air, and we just gotta kinda work through and have the end users and the lawyers understand the ramifications of decisions along the way on whether you're gonna use a certain tool and then what extraction methodology you're gonna use. Yeah. So you're being an educator as much as a consultant and forensic analyst here. So, yeah, makes perfect sense. Yeah. So that takes care of the left side of the EDRM in discovering mobile data. So let's turn to Nathan to talk about the right side. So, Nathan, what makes reviewing mobile data different from reviewing traditional documents? Yeah. Great question. There's a whole set of challenges that come with how you handle them in review and dealing more with the right side of things. So some data types coming off of mobile devices are just gonna be the same as you'd see in other sources, things like pictures and video files and even chat data. But some are different, unique, like text messages, things like even call logs and location logs and other formats that can be extracted from cell phones. I mean, that being said, often the text messages and the chat are what are most common to pull off of those. But just the fact that they can be collected by a variety of different tools means that they might be coming out in different formats. And so that in itself can pose challenges if they're not collected in a format that is supported by your particular review platform. So, just kind of going off of what Steve was saying, I think it makes sense when you're doing those early questionnaires to also be thinking about, like, okay, what tools can we use, and what formats would we need in order to have an effective review?
So I think there's a few things to keep in mind when planning and preparing for processing and review. And kind of two main things stand out. One is identifying what your reviewers need or would benefit from in terms of review format, especially for short message format. Like, are your reviewers okay with a text format or a spreadsheet format? Definitely seen a bunch of that. Are they gonna want a format that's very much like the original? Maybe it includes speech bubbles and emojis, and messages threaded, if it's a type of data where that is how the data can be organized. These things can have a big impact on review effectiveness, not to mention the speed at which reviewers can get through documents. And so, yeah, that's kind of the first question, I think, to keep in mind, just what your reviewers are gonna need and benefit from. Then the second that goes with it would be, as much as possible, aim to match your collection and processing and review tools to that need. Right? So there's kind of this whole balance of factors there around, like, what tools can we use, like Steve was outlining, in order to get the data at all, and within that, which are going to most serve the purposes of the particular review platform that we have now, or if we have several, thinking through, like, what's gonna be the best output for our reviewers. Because all these things affect case outcomes. Right? So, I mean, if you're in a position where it's financially and logistically viable, ideally, you can use a high-performance tool like Cellebrite and match that with a processing and review tool that can handle that output. But, of course, not every situation is ideal. Sure. Absolutely. And you mentioned Cellebrite. Steve called it the 800-pound gorilla. Obviously, it's a collection tool. So how does Everlaw process Cellebrite exports and short message data from those? Yeah. Great question.
So, yeah, Everlaw has the ability to handle that. If somebody used UFED, like, the Cellebrite tools, to collect from a mobile device, it'll output a variety of different formats, but if you've got the UFDR file from that, it's easiest to just drag that in and Everlaw will take it from there. So it's meant to be a really straightforward way of handling that data where you don't have to go through and parse that yourself. It'll pull out your short message data. It'll pull out your call logs and things like that, and then represent that, especially the short message format data, in a format that is meant to be very similar to the original to give reviewers a good experience around that. And, you know, context of messages and handling things like emojis can be a challenge. So how does Everlaw maintain message order and context and represent emojis? Yeah. Great question. So those are things that are part of the automated processing that happens when you drag that data in. It'll give you an output that handles emojis, that keeps messages in order. Maybe the easiest way for me to answer that question, though, is I can share my screen and just show you what something like this would look like in Everlaw. Let me go ahead and do that. Alright. So I think I'll just need for the slides to be unshared, and then I can share my screen. I'll stop sharing the slides there, see if that will help. Great. Thank you. There we go. So I think you can all see my screen here now. So I've got just a set of documents pulled up here, just to give you some examples of what this can look like when things are going right, when you've got a really thorough collection tool like Cellebrite and then a tool that can handle those outputs, and what a difference that can make compared to, say, reviewing in a spreadsheet format or text format. So here, for example, we've got an iMessage that was pulled out.
You can see it's got the speech bubbles. It looks very similar to the original. You can see at the top, there's information around, like, who are the participants in this conversation and the dates and times at which messages were sent, but also there's support for emojis and reactions. Those are also searchable. So, often in mobile data, emojis are used a lot. Right? So it can be really helpful to search across your dataset and be able to find documents that have certain emojis. And you'll notice too that over on the left side, in this case, we've got a GIF in here, where you can see it as an attachment, and you can go see what the content of that original GIF was just right in the context. You don't have to spend the time to maybe download it and view it offline or something like that. So it's meant to be an experience that's very similar to the original, very easy for reviewers to understand that really complex context that comes with the short message format and is so key to understanding communication. Here's an example of a text message from an Android device. This is a WhatsApp conversation. So you can see, across a variety of different types, how easy it is to review when things are kind of lining up really nicely in terms of the collection and review tools. There's also just some examples of, like, here's somebody's contacts. You can get a variety of different logs out of Cellebrite as well. So they're just meant to be kind of illustrative of the variety of things that are possible. There's a location log, things that may be particularly useful for certain types of cases in terms of tracking someone's activity potentially. But there's really a lot of powerful things that are possible. Alright.
Actually, I think there's one more thing I will show, which is, in terms of short message format, I think we mentioned being able to keep messages together that are part of a thread instead of just putting them in chronological order. So this is a representation of a Slack message. And, again, you've got the emojis and reactions, but you can see that there's a thread at the bottom of the first page here that is just all kept together in a format that's very much like it would be in Slack originally, and that makes it easy for the reviewer to always keep the context in line, as it was for those who wrote it. Alright. I'll stop sharing my screen, but this is just meant to illustrate how it can make a big difference for reviewers to be able to go through, not just on the collection side, which is often the key thing. Right? Like, whether or not you're getting the data at all, but also just for the reviewers, like, how can we actually find stuff in here and move through it effectively? Makes perfect sense. Thank you, Nathan. Very, very informative. Alright. So now it's time for the case of the missing messages. And while that sounds like the title of a cheesy 1940s film noir, it's a real challenge for legal and eDiscovery professionals. Ephemeral messaging apps like Snapchat, Signal, Wickr, and in China, WeChat, have become popular because they automatically erase the messages between parties immediately or after a short amount of time. And that's great for people who don't wanna leave a digital trail of their conversations, but it stinks for legal and eDiscovery professionals who wanna preserve, collect, analyze, review, and produce this evidence.
Beginning to use an ephemeral messaging app after there's a duty to preserve evidence is almost guaranteed to result in significant sanctions for intent to deprive the opposing party of that evidence, and I've covered a couple of cases involving WeChat and Signal where that happened. If your organization is using them before there's a duty to preserve, it will likely have to consider some changes to its policies once that duty begins. So that should be considered before using them for work communications. It's also important to keep in mind that any messaging app can be ephemeral if an auto-delete mechanism is set up for it, and that includes text messages. I've covered several cases where failing to turn off auto-deletion of text messages led to sanctions for spoliation of that evidence. So, obviously, you can't discover what no longer exists. So, Steve, I know that forensic experts are talented, but you're not magicians. So how do forensic experts like you retrieve and preserve ephemeral messages? Yeah. I think you kinda touched upon it, Doug. So the notion of ephemeral, which basically means transient, it's gonna be there one minute, and the next minute it's not gonna be there. It doesn't work like Mission Impossible where it blows up in your hands after you're done reading the message. These are really settings, and you made reference to that. I mean, you can kinda create an ephemeral platform through anything if you have a delete setting that allows it to be within thirty seconds or one minute or five minutes. If you go to the Signal app on my phone, because customers of mine wanted to communicate on it, so I downloaded and used it, I've got data from, you know, three, four, five years ago on my Signal app. None of it's changed because I don't have it set for deletion. For people that choose rapid deletion within seconds, minutes, or days, the data's gonna be gone. So you're always gonna be up against the challenge.
Then you have this additional layer of complexity in that we have many people bringing their device to work and/or companies not paying for and/or controlling devices that are utilized to exchange business-related information. So in those cases, now you have this issue of can you even control individuals' behaviors, and are they necessarily nefarious, or are they maybe concerned about privacy and security? And I've been at no shortage of, like, eDiscovery conferences and shows where I've heard some judges speak on this issue and say that the very notion of utilizing an ephemeral messaging platform is inherently, I don't wanna say evil, but can be looked upon or frowned upon by the tribunal. I don't know that's really a fair position, and I was talking to one of the prominent people that is on The Sedona Conference, and she's an attorney and takes a very, very different position on that. I personally think security is important, and I don't want to have people have access to everything I've ever said to anyone, particularly as it relates to my wife or three daughters. So, you know, it's a very, very slippery slope. It's probably something we could spend a whole hour chatting about. As far as challenges, if the data ain't there, the data ain't there. Now having said that, can you get to data? Sure. If you're a sleuth and you have some other ideas, like, I would think about backups and syncs that may have occurred prior to the time that auto-delete was enforced.
So whether it's a minute or a day or a week later, it's possible that people sync to the cloud or to a tablet or a computer and back up information, and that will allow you to go to that compartmentalized encrypted backup and access information that subsequently gets deleted through the delete function. And that doesn't sync like it would on your network, where all the data you deleted from my phone, when I hook it up to my network, it's gonna take care of both places and make sure the data's synonymous. That's not true of a backup file that's encrypted and/or encapsulated. So that is one way to get there, but, yeah, you're right. We're not magicians, and we can't necessarily recreate things that are deleted. And in the case of most ephemeral platforms, you can't mount the SQL database and then go there and find it. We can sometimes, with native applications like iMessage, still go in and find the information in a table or a database, but not so much on the ephemeral side. So if you are under a legal hold, if you have people that participate in your business, then you need to have, once again, crossing the bridge early, active communications with them about settings so that you don't have this information seepage and deletion occurring in the middle of something very important like having to respond to, you know, production requests. Yeah. Absolutely. So you talked about backups. Are there any other defensible ways to prove a message existed if it has disappeared? Well, you could always go to, let's say it was a participant, and you and I and Nathan were all chatting amongst ourselves, and I deleted all my data, but you all didn't. So there's a possibility in a tri-party conversation that other devices may be housing information. It may not give us 100% of the convo, but it would give us some insight into what was being discussed and when it was being discussed. So that's another technique. Okay. Alright.
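The tri-party technique just described, piecing a conversation back together from the devices that still hold it, can be sketched as a de-duplicating merge. This is a minimal illustration under stated assumptions: the `Message` record, its fields, and the sample data are hypothetical, and real extractions would need timestamp and identifier normalization before messages from different devices could be compared this simply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sent_at: str  # ISO 8601 timestamp, assumed already normalized to UTC
    sender: str
    text: str


def merge_device_exports(*exports):
    """Union messages recovered from several devices, drop duplicates,
    and return them in chronological order."""
    unique = set()
    for export in exports:
        unique.update(export)
    return sorted(unique, key=lambda m: (m.sent_at, m.sender, m.text))


if __name__ == "__main__":
    # Hypothetical data: one participant deleted everything, two did not.
    steve_device = []
    doug_device = [
        Message("2024-01-01T12:00:00+00:00", "Doug", "Are we still on for Tuesday?"),
        Message("2024-01-01T12:01:00+00:00", "Steve", "Yes, see you then."),
    ]
    nathan_device = [
        Message("2024-01-01T12:01:00+00:00", "Steve", "Yes, see you then."),
        Message("2024-01-01T12:02:00+00:00", "Nathan", "I'll bring the files."),
    ]
    for message in merge_device_exports(steve_device, doug_device, nathan_device):
        print(message.sent_at, message.sender, message.text)
```

The merged result may still be incomplete, since a message deleted from every device before collection is not recovered by this approach.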
And, obviously, you touched on some of these already. But from your perspective, what other legal and compliance challenges come with ephemeral messaging? Well, I think it's kind of the inherent guilt by association. You know, if you delete things, whether it's active participation deletion or auto-delete or a setting you make on a program like Telegram or Discord or Signal, you know, why are you deleting your data? And so, like I said, some people would claim because you're a bad character, that's why you did it. Some on my side might say privacy and security reign. So it becomes a little bit of an argument. But clearly, from a legal standpoint, from an organization or enterprise standpoint, you know, we've gotta coach our employees and the people that we deal with that, for our own protection, so that we don't look like the bad character, we don't enact those deletion techniques, or we don't utilize those platforms that, by their very essence, have built-in deletion. Yeah. Shadow IT is a problem more than ever these days, and you always have to make sure your people aren't using platforms they're not supposed to be using. Right. No doubt about it. Yeah. So, Nathan, let's talk about how Everlaw handles structured short message formats. Can you talk about that for a bit? Yeah. So I guess this is gonna come back to kinda what Steve had said before. It depends. It just depends a lot on, like, what type of formats you have. And so if you're coming in with a Cellebrite export, you know, that's clearly ideal. If you've got Slack data coming out directly from Slack, so it's in a JSON format, that's great. You can load it in. I guess I would say just about JSONs in general, though, often the question to ask would be, great, we've got JSON data, but what type of JSON data is it? And is that something that's supported, whether it's by Everlaw or another platform?
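The "what type of JSON data is it?" question can be made concrete with a small sketch that checks which shape a record has before mapping it to a common form. Both record shapes here are assumptions for illustration: the first is loosely modeled on the `user`/`text`/`ts` fields of a Slack export message, the second is an invented layout for some other chat app. Nothing here describes how Everlaw or any other platform ingests data.

```python
import json
from datetime import datetime, timezone


def normalize(record: dict) -> dict:
    """Map a chat record of a recognized shape to {sender, text, sent_at}.

    Raises ValueError for shapes this sketch does not know about, which is
    the "is that format supported?" question in code form.
    """
    if {"user", "text", "ts"} <= record.keys():
        # Slack-style shape (assumed): "ts" is epoch seconds stored as a string.
        sent = datetime.fromtimestamp(float(record["ts"]), tz=timezone.utc)
        return {"sender": record["user"], "text": record["text"],
                "sent_at": sent.isoformat()}
    if {"from", "body", "timestamp"} <= record.keys():
        # Hypothetical second shape: ISO 8601 timestamp with a UTC offset.
        sent = datetime.fromisoformat(record["timestamp"]).astimezone(timezone.utc)
        return {"sender": record["from"], "text": record["body"],
                "sent_at": sent.isoformat()}
    raise ValueError(f"unsupported record shape: {sorted(record.keys())}")


if __name__ == "__main__":
    raw = """[
      {"user": "U123", "text": "Lunch?", "ts": "1704110400.0"},
      {"from": "alice", "body": "Sure", "timestamp": "2024-01-01T13:05:00+01:00"}
    ]"""
    for record in json.loads(raw):
        print(normalize(record))
```

Two files can both be valid JSON and still need entirely different mappings, which is why the file extension alone does not say whether a review platform can load the data.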
Because you can have JSONs for Slack and a different format of JSONs for other data types. So that's kind of an important key consideration. But, yeah, if you've got RSMF data, you can pull that into Everlaw as well. But I guess I'd say with ephemeral messaging in particular, sometimes it's coming from apps that are a little more, maybe I'd say, on the fringe of the business world. And so there just aren't as many tools for collecting it in a standard way. And so, I mean, at Everlaw, we're definitely always very interested in seeing what other formats are out there and building support for them, often very quickly, if there's kind of a key need for it. But, yeah, I guess we already saw some of what the outputs are when data's coming in that is already supported, but there's really kind of a variety of different ways that you could get the data in and be able to review it. Yeah. A lot of these source apps never really think about the eDiscovery considerations until downstream. They never built it in at the front from an eDiscovery-by-design standpoint, so that leaves us with plenty to do. So, from your perspective, what are some best practices you can recommend to our audience for reviewing ephemeral messages efficiently? Yeah. So, yeah, I think that's a good question. On the one hand, there is not necessarily gonna be a lot of difference between the ephemeral data once it's collected and other types of data. Right? It might just be a standard chat format, and so you would just follow general practices that you might with any more standard chat or short message data. That being said, I think there's definitely cases where screenshots become important, where a witness to a certain message was able to take a screenshot right before a message disappeared. And those can be key pieces of evidence. Right?
It's not necessarily gonna be high volume, but it can be really important. And so for those, I guess I would just say it's important to be aware, or at least be asking, you know, are there those types of screenshots coming in, images, and in those cases, to make sure to OCR them. You know, it's just a few clicks if you need to OCR a set of docs, and there's no cost involved with that. But, I mean, that can just be really key for making those searchable and making them really easy to review. Mhmm. Yeah. Yeah. You mentioned screenshots. I remember a case that was, I think, covered on 48 Hours, a criminal case where, I guess, the suspect's girlfriend was not trusting of her boyfriend, so she took a screenshot when he was located in a particular odd area on Snapchat, because normally that information wouldn't be retained, and it turned out to be where he was basically disposing of his victims. So, certainly, you know, that's the type of thing where, obviously, when ephemeral messaging gets rid of the data, a screenshot turned out to be crucial in that case. So, Steve, I'll kinda go back to you for the last word on this topic. What else, if anything, do you recommend that legal teams do to prepare for ephemeral messaging in eDiscovery? Yeah. I think just being knowledgeable about, you know, utilization by your employees and contractors. You know, one of the first questions we ask is, are you using only native, you know, communication exchange like iMessage, or are you utilizing third-party apps? And if you're using third-party apps, which ones are you using? And when we start to drill into that, then it comes into this question of how easy are those to access and then how easy are those to normalize or parse so they look attractive inside a platform like Everlaw.
I think just crossing the bridge early, talking with people about what you're up against, doing the due diligence on custodial questionnaires. You know, the problem with things like screenshots, much like the problem with metadata, is those are items that can be modified and edited. Like, you can take things and put them into text editors, you can put them into graphic editors, you can change them. It kinda defeats the whole world of forensics, which really should be a science and more mathematical by nature. And when we image a computer, it absolutely is mathematical, and we can prove that up. Anytime you introduce some of these solutions, which are absolutely the correct answers to the questions you've asked today, it just becomes gray and not so black and white in terms of what you're up against. So then, you know, I always say when lawyers have facts on their side, they argue facts. When they have law on their side, they argue law. And when they have nothing on their side, they just argue. But nowadays, what they do is they argue about process. And anytime we introduce gray into the area, they've got something to grab onto and argue. It's not right or wrong. They're advocates for what they do. And it's important for us to make it much more black and white and binary in terms of decision making. And so anything we can do to tackle upfront our knowledge base about what people are utilizing and then best practices and using the latest and greatest tools that are out there. And those tools change. Like, one day an API works just super with Facebook and the next day it doesn't. And so we're constantly having to educate and then pivot and then normalize data. Like Nathan mentioned, you know, you're dealing with JSONs, that's JavaScript, and they have different numbers of fields.
They represent characters differently, and we have to normalize that data so that practitioners see it just like Nathan says, the way they do when they're picking up and looking at their phone in a textual communication or they're looking at a collaboration platform. You gotta make it appealing in how they're used to seeing it. So I do think a lot of this is just preparedness going into it. Yeah. Certainly makes sense. Alright. So our third and final challenge is one that I've covered a lot and one that we could spend the entire hour discussing. And I know because I've been part of four different webinars dedicated just to this topic, and that's hyperlink files, which are also known as modern attachments, though many people feel they shouldn't be called modern attachments because they're different from a traditional attachment. So what's the issue here? I mean, we've always had hyperlinks to websites, files, and other types of content within emails and other communications, but the move to cloud based solutions like M365 and G Suite for so many organizations has led to those organizations beginning to standardize linking to files in cloud repositories like Google Drive, SharePoint, and OneDrive instead of embedding them within the email. This reduces redundancy of data, so it's great from an info gov standpoint. But since there's no longer a snapshot of the file within the email, it stinks from the eDiscovery standpoint, as that linked file could have been modified or deleted, or the rights to it may have been revoked since the message was sent.
We've seen a growing number of case law rulings regarding the proportionality of discovering hyperlink files, including Nichols v. Noom in 2021, two decisions in the In re StubHub case in 2023 and last year, and several decisions in the In re Uber Techs. case last year and this year, with most of the rulings finding that discovery of hyperlink files was not proportional, at least with the technology available at the time. So while 72.2% of 551 respondents to eDiscovery Today's 2025 State of the Industry Report survey said that hyperlink files should be treated as modern attachments, whether parties can do so has become a big can of worms. So, Steve, why is collecting hyperlink files so darn difficult? Yeah. There's so much to unpack. If I had a gin and tonic or a beer, we could talk four to six hours on this topic, but we don't, so I guess we're gonna keep it briefer. I mean, first of all, I think there's an oversimplification of the terminology. I'm on the bandwagon that attachments being displaced by links are still attachments. Okay? But that doesn't mean everything that's a hyperlink is an attachment, because you could hyperlink to the Library of Congress or to an entire university's library or repository. There's a lot of things. The data, like you said, could be dynamic, could be changing. It could be accessibility. It could be broken over time and dynamic. So there's so many other things here. We have links inside the links. We have multilayers where you're drilling down into a spreadsheet, and it's got 15 different links. I mean, once again, the complexities around that are dramatic, and I think those are the things that are used by a lot of people that argue the case that you shouldn't have to produce because, oh my god, we didn't even mention versioning. Right?
So for information that was sent back in 2022, is the document you get when you do a forensic collection, in fact, the version that existed on that date, or if it's a collaborative document, has it been edited and modified? And depending on the tool you collect it with, are you actually getting the latest and greatest that lives inside Google Workspace or OneDrive or SharePoint? And the answer is the one Nathan loves and I love: it depends. Right? So just like you need to understand ephemeral data and how it lives, you need to understand the risk and the rewards and the cost benefit to collecting data with versioning. And can you go ahead and say, from a defensibility or a testimony standpoint, that this was in fact that data, or is it just a spreadsheet that happens to exist that's now in a new format? And I can't really warrant as a testifier that it was that item on December 24, 1991, or February 2015, or whatever date we're dealing with. So it is complex, but you know why we've moved toward this. We've moved toward this, as I think you mentioned, in large part to kinda reduce the footprint. Like, we had big organizations, big enterprises where you had documents floating around. They figured, well, we're behind the firewall. Why not just point to the location and everyone can access the same place? Same concept for collaboration, have everyone edit it where we're not sending it out and then having to rekey it in. So I do think that, and I know this sounds like my it depends comment, but you've got to cross this bridge early. You've got to do it during your meet and confer. You've got to deal with how are we going to tackle the animal of modern attachments. And do you do it on a null hypothesis basis where you come back after the fact and pick the three or five or 20 documents that are super inflammatory and then drill down into any type of versioning issue there?
Or do you do it upfront where you pump up the fatted calf by going from 30 gigs into a terabyte in someone's mailbox by grabbing all the versioning? And then that begs the question of how things version. Like, Google creates versions every second or two. And granted, they're iterative, so they're not like the totality of the document. But if you took one and exported it out, it would give you the whole document with whatever change existed. So you're dealing with a lot of different moving parts. I don't know that it's something where you can just say, it's absolutely this way or it's absolutely that way. I think you need to be nuanced, and you need to be communicative, and you need to be aware of the sophistication of your opponent to be able to talk through these issues and have an agreement. And I think that's the bigger thing: technology's probably going to catch up and give us better answers in terms of getting documents as they existed on the date of the transmission, but we're not there yet. And so then we just need to be self effacing and honest with our opponent and talk through the challenges of what we're up against, making sure people have access to the information they need to make judgments in a case, both from a tactical standpoint and an exchange of information standpoint. And I know it's kind of a very watered down answer because it's not gonna give you an absolute, but I think these are the concerns that people need to talk about. Yeah. Well, I mean, there are no absolutes with hyperlink files today. And, you know, you talked about the versioning, and I think it's the In re Uber Techs. ruling from last year where Judge Cisneros talked about, you know, the difficulty and the challenge of collecting these files, but the fact that they do go to who knew what when.
And she used a term that I like, calling them contemporaneous versions, which I think is certainly something where we can all kinda maybe get on the same terminology from that standpoint. I like that term particularly. But how do forensic teams determine which version of a document existed at the time of an email or chat message? It depends, but it depends on what tool you're using. Like, in the Google environment, you know, there's a company that has come out, FEC, the Forensic Email Collector, that puts Humpty Dumpty back together again, and it will give you the version. There's a little option, a box to pick, so that you can get the version that existed as of the date of transmission. Microsoft and some of the vendors that support Microsoft are moving towards that currently. They're not there yet, so every single large provider has created a script where basically they go get the last document that exists inside SharePoint or OneDrive before the date, and they cross reference through an ID, like a parent or family ID, and they can say this document and some version of it existed as of the date of transmission. But if it was deleted along the way, that may be a minute, it may be a day, it may be two weeks before. And from a testimony standpoint, I'm not sure I wanna risk my career on saying it was the exact document that got transmitted. It may be the closest one we can identify, and I think that's the best argument I've heard from forensic practitioners. But I don't think you can prove up it was exactly the document that was sent or received that day. Yeah. Absolutely. So with all the challenges you're talking about here, what are the risks of missing linked content in discovery? I think it depends on, you know, how your meet and confer went and what you agreed with the other side, the whole thing, a little joke about the argument and what lawyers wanna pick at later.
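As a rough illustration of the script approach Steve describes for SharePoint or OneDrive (take the last version saved before the date of transmission and record how far before), here is a minimal Python sketch. It is not any vendor's actual script. The `Version` record, its field names, and the `closest_prior_version` function are all hypothetical, and it assumes the version history has already been exported with a shared document ID and a saved timestamp per version.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Version:
    doc_id: str         # hypothetical ID shared by every version of one file
    version_label: str  # hypothetical label, e.g. "2.0"
    modified: datetime  # when this version was saved


def closest_prior_version(
    versions: Iterable[Version], doc_id: str, sent_at: datetime
) -> Optional[Tuple[Version, timedelta]]:
    """Pick the last surviving version of doc_id saved at or before sent_at.

    Returns the version plus the gap between its save time and the
    transmission time, or None if no version that old survives.
    """
    candidates = [
        v for v in versions if v.doc_id == doc_id and v.modified <= sent_at
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda v: v.modified)
    return best, sent_at - best.modified
```

Returning the gap alongside the version reflects the testimony caveat above: the result is the closest version that can be identified, and a gap of a minute reads very differently from a gap of two weeks.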
I think if you deal with this upfront and you do the best you can, and I listened to a very prominent litigator at Arkfeld in Arizona a few weeks ago talk about this issue, and what she said was, listen, we're doing the best we can. And that may sound watered down again, but I think it's a good answer. Like, you know, if you're really trying to be frank and honest and divulge and disseminate the information that you have on your side of the equation, and you're trying to get it to the other side, and you're doing the best you can do, I have a tough time arguing with that as the system catches up and vendors catch up and providers catch up to do it. But I think where these become problems is when you don't talk about them originally. You know? If you deal with this after the fact, after the exchange of information, attempting to edit and put Humpty back together then is always a mistake. You know, trying to rebuild families, trying to take information that you didn't get out at the same time. So the more you can do the collections, whether it's like a Jira or a Confluence or a Slack or a Teams or whether it's a true modern attachment, I think the more you deal with the issue at the outset of the case and you use widely accepted tools that are out there, that's the best defensible position, but it is not perfect currently. Sure. No. Not at all. Yeah. And certainly, case in point, the In re StubHub case, where they agreed in an ESI protocol to produce hyperlink files as modern attachments before they realized the difficulty of doing so. And the only thing that saved them was the ESI protocol had a provision to modify it with good cause, which ultimately they were able to prove to the court. So, absolutely, you definitely gotta know your capabilities before you sit down for that meet and confer. So, Nathan, I'll turn to you now. Let's talk about downstream and what challenges can be expected for processing and review. Yeah.
So I think most of the discussion around hyperlink documents, whether it's case law or articles, webinars, you know, conference panels, and and so on, is very rightly focused on the preservation and collection side and production, since after all the main, issue at stake is whether documents are going to be exchanged and used in a case. But, but I think it's also useful to, consider challenges that arise for hyperlink docs in review. So I think that can also end up being impactful and and and are worth thinking through, just in terms of, like, how manageable is this data once we get it. So I think, one challenge I kind of have several challenges in mind. One challenge is just to get the those hyperlinked documents mapped to the linking document. So sometimes hyperlinked docs, they're, you know, they're called modern attachments as as we've said. And so in the review context, that could suggest that after collection or or receiving a production that, oh, they can just be treated like any other attachments, but that's not always the case. And I think one of the main one of the main differences is, you know, originally, there was maybe just one hyperlinked document, but you can have several pointing to it. And then how are you gonna map that to be part of several families? Are you gonna make several copies of it? So there are some differences there in terms of comparing to, email attachments, kind of traditional, families in that way. But I I guess I'd say if you don't, map them at all, then, that's not that's often not and it's very common situation, not necessarily very ideal situation. Like, if you've got documents in your corpus that were originally hyperlinked documents, they are treated, as loose documents if they're not linked, no discernible relationship to the linking document, then yeah. Maybe they were brought in because just their own contents are are relevant to a certain search or or or for some other reason. 
But if they don't have a relationship, that can provide challenges for searching if you if you're trying to group together documents in the same context. Call it families if you want, but, even if you don't call it a family, it can still be very useful to to know what the full context of the communication was. Right? So, and then so that searching and grouping on the review side, it can also present a challenge because even if you can you're looking at that email and you can see, oh, it's got it's got the link in there, but how can I go find it? Right? It can be very time consuming potentially to go find the hyperlink document or even to know if it's in your dataset at all. Like, did we even collect that, or did we receive it? So, yep, do we have it? So there's definitely challenges that come if you don't have if you don't have the mappings. And, yeah. So those are those are some of the challenges that just come with, getting them mapped in the first place. Then I guess I'd say another challenge comes with if you're going to map them, just how you get some metadata in place, that you need. So some tools, like Everlaw, have the ability to automatically map it if you're using, so Everlaw has some built in connectors that can, perform collection from sources like Microsoft three sixty five or Google Vault or things like that. If you're using those to pull in those hyperlink documents, it will automatically map those to the linking documents that are pointing to them. So there are a few tools out there, review review slash collection tools that can do that automatically. A second, way that those can be mapped is through, purpose built scripting. So which is, you know, a number of service providers, have some ediscovery teams and people teams have, or you may have folks who are who are doing it manually, like, just editing family values, which can be somewhat manageable if there's just a small number of documents. 
But even then, you need to presumably already know what the families are in order to map those up. So that's a challenge, I think, just like how do you get that metadata in place? And then a third challenge is, if you are able to bring them in and treat them as kind of standard families in a review environment, I guess, if you're able to bring them in and have multiple copies of them, one for each of the families, if you've got several documents pointing to one, then that just kind of raises some issues around dealing with extra duplicates, extra copies of documents. It, in some ways, breaks kind of an ediscovery paradigm that one source document should correspond to one document, ideally, in review, instead of having many copies that don't correspond to actual different files in the original sources. So I think that presents some challenges as well. And then it kind of extends into what are you gonna produce? Are you gonna produce a single copy? Are you gonna produce one per family? And a lot of that's probably gonna come down to the ESI protocol, but it's just gonna make a difference that you've kind of thought through that ahead of time so you know what process you're gonna use to be able to do that and know that the tools that you have are able to handle that. Okay. You've already talked, I think, a little bit about this, but are there any other improvements or workarounds for handling linked files? And if so, what are they? Yeah. So, I think a couple come to mind. I think one of the main ones, like I briefly mentioned, is to have hyperlinked documents linked as family members.
So, you know, for purposes of review, aiming to treat them similar to how you would email attachments, which you can do by having metadata values that link them as part of the same family and allow you to, you know, search and group and review those in context. Although that comes with some challenges if you have multiple documents pointing to the same document and you're somehow getting the multiple copies in. But it has the advantage that it gives reviewers a lot of context in terms of being able to see all the hyperlinked documents related to a linking document. Just kind of look at them like you would attachments to a parent, as it were. At the same time, even if reviewers come across a hyperlinked document, it's not easy to see, like, what are all the documents that are pointing to this, that are linking to it. So I find that to be an interesting challenge. One of the strategies that Everlaw offers, I think, is pretty unique. So I'll just kind of explain a little bit how it works as another sort of alternative to this. Of course, you can also, in Everlaw, map hyperlinked documents so that they're treated as standard attachments. But Everlaw also has functionality that's purpose built in the review platform that can map multiple linking documents to a single hyperlinked document. And so this, in effect, recreates the document relationships as they existed in the original source, in the original location wherever it was, if it was in Microsoft or wherever. And this can be advantageous for review. One, it doesn't create multiple copies of the document. Two, at least the way Everlaw does it, it represents these hyperlinked relationships as distinct from standard email or attachment family relationships.
And then, three, it actually allows reviewers, if they come across a hyperlinked document in review, to go and see which documents are pointing to it. So there's kind of a context for being able to see, I think we call them backlinks, to be able to see which documents are pointing to it. And that could be very helpful in review in order to see kind of the full context of who's been referencing this document and go back and explore those connections. So, yeah. And I can show that in a little bit to just kinda help us imagine how that works. But, yeah, it's meant to be kind of an alternative approach on the review side that gives reviewers more functionality and more visibility. Alright. I have one more question for you, but I'm gonna ask Steve a question first, because I think the last question I ask you will kind of be a good lead-in to the demo you're gonna do. So, Steve, you know, one final takeaway, and you touched on this a bit, but how can legal teams adjust their collection strategies to account for cloud based linked files? Yeah. I think, identification. What we always do is we talk about a three legged stool when we start to think about doing a forensic collection. So I always identify, like, what time frame are we talking about? Is this a week ago, a year ago, ten years ago? And some people are like, well, besides volume of data, why do you care? Well, I care because of migration from platforms. You may have been in a different email platform and moved to a new one. You may or may not have ported over all the data. Same thing is true of phones, like Tom Brady's phone. Right? He went to a new phone for Deflategate, and then he claimed that he didn't have any of the information. Well, I always make the joke, no adult in America is gonna repopulate every contact they ever have. Right?
Everyone is either gonna port it over from an old phone they have or from a backup that exists already. So I don't know that you're gonna have situations where data just falls off the end of the earth. I think you need to know, a, time frame; b, who are the people, people of interest, what we call custodians; and then, most importantly, what physical devices and data sources do they have. And when you get to this data sources issue, now you're able to identify, are we gonna run into an issue with potentially modern attachments and this idea of having to pair things back together. And Nathan went into a lot of really strong technical explanation about how you can do it. One of my favorite words in the world: concatenation. Right? That's when you go together and use a commonality of a single field in two separate databases or spreadsheets so you can tie together information. That's really what we're kinda doing when we pair together a family of information, including subsidiary attachments to a parent communication or document or presentation, because we're marrying together, concatenating back together that information. And so the predicate to that is knowing what source data we have. If you have Google Workspace, ding ding ding, you're in business, because there's a product out there that'll allow you to plug in upfront. And so now that I'm knowledgeable about that as a user, I can use that tool to go ahead and apply. But that's good for Google, may not be good for another platform. So you need to know what's out there, what's available, and whether you hire them or you do it yourself, you need to get good recommendations, good knowledge extraction from experts in the industry. Yep. Yep. Couldn't agree more. So, Nathan, how does Everlaw display and manage linked documents? Yeah. So I'm happy to go ahead and share my screen here to just give an illustration of a couple of things that I was mentioning before. Let me see.
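To make the pairing idea concrete, tying linking documents to linked files through a common ID while keeping a single copy of each file, here is a minimal Python sketch. It is an illustration only, not how Everlaw or any collection tool implements it. The function `build_link_index` and the sample IDs in the usage note are hypothetical, and it assumes each link has already been resolved to the ID of the file it points to.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_link_index(
    links: Iterable[Tuple[str, str]], collected_ids: Set[str]
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]], Dict[str, List[str]]]:
    """Pair linking documents with linked files on a shared file ID.

    links: (linking_doc_id, target_file_id) pairs pulled from messages.
    collected_ids: IDs of the linked files actually in the review set.
    Returns three maps: outbound links per linking document, backlinks
    per linked file, and links whose target was never collected.
    """
    outbound: Dict[str, List[str]] = defaultdict(list)
    backlinks: Dict[str, List[str]] = defaultdict(list)
    missing: Dict[str, List[str]] = defaultdict(list)
    for linking_id, target_id in links:
        if target_id in collected_ids:
            # One stored copy of the file, referenced from both directions.
            outbound[linking_id].append(target_id)
            backlinks[target_id].append(linking_id)
        else:
            # The message points at a file that is not in the dataset.
            missing[linking_id].append(target_id)
    return dict(outbound), dict(backlinks), dict(missing)
```

For example, if `msg-1` and `msg-2` both link to `file-A`, the backlinks map holds one entry for `file-A` listing both messages, so the file does not have to be copied into two families. The missing map answers the reviewer's other question: did we even collect that?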
Here we are. So Everlaw, if you can see my screen here, has a panel on the left hand side where you can see a variety of different contexts related to a document in review. So, if I open up the attachments here, I can come through and look at my attachments and whatever those might be. And so that can be a way that you can map families, or create families with hyperlinked documents, treating them as standard email style attachments, and you can even edit metadata just as you're going through in order to build those families as you're going, if you happen to already know what those documents are. But like I mentioned, there is purpose built functionality to really make life easier for reviewers that automatically maps hyperlinked documents when you're bringing them into Everlaw. And so that is another option down here. You can see it's distinct from the actual standard attachments here. It's the linked documents context. And I can see here that this document actually does have one linked document here. Yeah. It'll also tell you if there are links to other documents, but those other hyperlinked documents just weren't collected. That itself can be really useful, right, to just know, like, oh, this was pointing at several different documents, but we just don't have them all. But in this case, if I click on here, it'll take me to this particular linked document, and then I can review that here in the context of the original Teams message. Now, that is if I'm looking at the pointer docs, right, the linking documents. But at the same time, if I'm going through review and I come across the hyperlinked doc itself, you'll notice here in this linked doc context, now there's nothing here under the outbound links. That's what we're calling it, you know, when it's pointing to another file.
But there's a backlink section here, that that just makes it really easy to come in and see, like, oh, there's actually four different documents here that linked to this picture. And it was, you know, people, after a company party or something where there is some artwork done, they they just wanna share out some of their, you know, one of their favorites, or I don't know what it might be. But, in this case, it can just be really helpful to know. It might be something much more key to to a case. Right? Like, oh, all these different people, were sharing this document, and, and, and that can be important to know in order to go investigate, and dive into the data a little more deeply. So wanted to mention just kind of show this as another possibility, that that has its own strengths and and can potentially make a difference when we're when we're thinking about