
Best Practices for Research Data Management in 2024

Updated: Mar 25

In 2021, over 22 billion data records were exposed in more than 4,125 breaches across the globe. Even worse, a recent survey found that nearly 75 percent of business executives do not completely trust their organization's data. How did this come to pass, and what can be done about it? In this article, we'll outline best practices for research data management that can help enterprises rein in data sprawl and start properly managing their critical information assets. Through robust data governance frameworks, enhanced data quality processes, and the latest research data management technologies, organizations can unlock deeper data-driven insights without assuming undue risk. You'll learn specific steps to stop treating data as an afterthought and make it a central component of corporate strategy.


1. Define Clear Data Management Goals 

Defining clear data management goals is a critical first step in establishing effective practices. But what exactly constitutes a “clear goal” when it comes to managing research data?

 

You may be wondering - how specific do these goals need to be? Let's explore this question together.


We'd recommend starting by asking yourself a few key questions: 

  1. What are the core objectives you want to accomplish through better data management?  

  2. What specific pain points or challenges are you hoping to overcome?

  3. How will reaching these goals tangibly improve and streamline your research workflow?


The more detailed you can be in answering these questions, the better. Vague goals like "improve data quality" or "make data more accessible" are an OK start, but not nearly targeted enough.


You need concrete goals that zero in on tangible outcomes. For example:

  • Reduce time spent searching for relevant datasets by 40%

  • Increase findability and reuse of experimental data across teams

  • Decrease incorrect or duplicated data by implementing master data management

  • Limit vulnerable exposure of confidential data through access controls


See the difference? Outcomes like "faster search", "increased findability", "decreased errors", and "limited exposure" are very specific, measurable, and even actionable.


Of course, don't restrict yourself to just these examples. The right data management goals will be unique to your research context and challenges. Reflect on where the pain points exist in your current workflow and translate those into precise goals that address the root causes.


Doing so will provide a clear direction that orients all your data management efforts. It enables you to benchmark progress and know when you've succeeded. In essence, well-defined goals are like a compass guiding you as you navigate building more effective data management practices.


2. Build Strong File Naming and Cataloging Conventions

Strong file naming and cataloging conventions provide immense value in improving data management. But what exactly constitutes a "strong" convention when it comes to research data?


First, let's distinguish file naming from cataloging. File naming involves assigning descriptive names to individual files and folders. Cataloging refers to adding rich metadata tags and descriptions in a central catalog or repository.

 

Both help make data more findable and usable down the road. But they serve complementary purposes. File names allow quick identification of a file's contents immediately. Cataloging enables more robust searching, filtering, and organization.

With that distinction drawn, here are some best practices for each.


Best Practices for File Naming Conventions 

Some of the best practices when devising a naming convention are to:

  • Include core elements like project name, dataset type, date, and version. e.g. "ProjectX-SurveyData-2022-V1".

  • Avoid cryptic abbreviations and numbers. Spell out terms if needed.

  • Use delimiters like dashes, underscores, or periods to separate elements.

  • Keep individual names concise but make sequences meaningful.

  • Organize folders following similar naming logic.
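
To make the convention concrete, here is a minimal sketch in Python that assembles names following the element-and-delimiter pattern above. The helper name and example project are hypothetical, not part of any standard:

  from datetime import date
  from typing import Optional

  def build_file_name(project: str, dataset_type: str, version: int,
                      created: Optional[date] = None, extension: str = "csv") -> str:
      """Assemble a name like 'ProjectX-SurveyData-2022-V1.csv' from its core elements."""
      created = created or date.today()
      # Strip spaces so each element stays a single, delimiter-free token.
      parts = [project.replace(" ", ""), dataset_type.replace(" ", ""),
               str(created.year), f"V{version}"]
      return "-".join(parts) + "." + extension

  print(build_file_name("ProjectX", "SurveyData", 1, created=date(2022, 6, 1)))
  # -> ProjectX-SurveyData-2022-V1.csv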


Best Practices for Cataloging Conventions

Some of the best practices we've come across when formulating a cataloging convention are to:

  • Fully describe the dataset contents, source, timeframe, methodology, and relationships. 

  • Use controlled vocabulary when tagging rather than ad hoc words.

  • Include personnel contact info and any access instructions.

  • Specify the spatial and temporal coverage. 

  • Note related publications, grants, or projects.

  • Add any relevant URLs to internal or external resources. 

  • Identify which team or group "owns" the dataset.


The goal of both approaches is to make your data identifiable and understandable to others without needing explanations. Following thoughtfully crafted conventions reduces errors, duplication, and time wasted decoding messy or vague names and descriptions.


While it does require upfront effort to define and implement strong conventions, the investment pays ongoing dividends through improved discovery and understandability. Does your current approach live up to these best practices? If not, now may be the time to build more robust file naming and cataloging standards.


3. Carefully Consider Metadata for Data Sets

Metadata is a crucial component of effective data management. But what specific metadata should you track to maximize the value of your research data sets?

As a refresher, metadata refers to contextual information that describes the contents, structure, and provenance of data. Robust metadata is key for organizational findability and technical usability down the road.

  

When deciding what metadata to include, consider these categories:

  • Descriptive metadata - Provides overview of what the data contains. Include title, abstract, authors, keywords, related publications, and contact info.

  • Structural metadata - Describes internal organization like data types, software formats, and interrelated tables/files in a dataset. 

  • Administrative metadata - Captures technical details like file sizes, creation/modification dates, access permissions, and terms of use.

  • Statistical metadata - For statistical data, be sure to include info on methodology, variables, definitions, codes, scales, weights, and other analytic details.


The more metadata you can provide in each category, the better. But aim for a practical balance between comprehensiveness and efficiency given resource constraints.

Ideally, metadata should be automatically captured where possible, such as technical and administrative info. Seek tools that can extract or inherit metadata to reduce manual effort.


For descriptive fields, develop a standard template so metadata is consistently maintained. Training researchers on proper metadata practices is also wise to ensure buy-in and quality.
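
As a rough sketch of such a template, the field names below are illustrative placeholders rather than a formal standard like Dublin Core, and the helper simply flags records with empty fields:

  # Illustrative descriptive-metadata template; adapt field names to your own standard.
  METADATA_TEMPLATE = {
      "title": "",               # short, human-readable dataset name
      "abstract": "",            # one-paragraph summary of contents and purpose
      "authors": [],             # names and affiliations
      "keywords": [],            # controlled-vocabulary terms where possible
      "related_publications": [],
      "contact": "",             # email of the responsible researcher or steward
      "temporal_coverage": "",   # e.g. "2021-01 to 2022-06"
      "spatial_coverage": "",
      "methodology": "",         # how the data were collected or generated
  }

  def missing_fields(record: dict) -> list:
      """Return template fields left empty, so incomplete records can be flagged."""
      return [k for k, v in METADATA_TEMPLATE.items() if not record.get(k, v)]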


And remember - metadata's value is only realized through thoughtful organization and search functionality. The ability to quickly filter, sort, and find data based on associated metadata is just as essential.


Carefully considered metadata unlocks so many possibilities for efficient analysis, effective collaboration, and impactful data sharing down the road. What improvements could more robust metadata bring to your research data management practices?


4. Where and How You Store Data

The exponential growth of research data makes data storage an increasingly complex challenge. Where and how you store data has huge implications for access, security, costs, and analytical performance. Here are some key considerations.


Storage Infrastructure 

Assess your needs and options for on-premises servers vs cloud-based storage. Cloud offers convenience but may raise security concerns. Maintaining your own infrastructure gives more control but requires resources and brings responsibility for backup, disaster recovery, and security.


Data Types and Usage

Understand what types of data you store (e.g. genomic sequences, imaging data, survey results) and how frequently it is accessed. This informs needed capacity and performance. Hot vs cold data may go on different tiers.


Security and Compliance

Research data can contain sensitive and regulated information. Storage must safeguard confidentiality and meet compliance mandates like HIPAA in the United States or PIPEDA in Canada. Access controls, encryption, and audit logs are table stakes. And if your research involves Indigenous data, you will also need to consider the OCAP (Ownership, Control, Access, and Possession) principles.


Cost Management

Data storage costs can spiral out of control. Project future growth and employ tiered storage with cost-effective long-term options like tape or cold cloud storage to optimize spending.

 

Backups and Recovery

Back up critical data redundantly both on-premises and off-site in case of outage or disaster, and test recovery procedures on a regular basis.


File Formats and Compression

Use standardized, non-proprietary formats to support long-term access and ease of use across a collaboration network. Compressing and deduplicating data also trims unnecessary storage volume and reduces cost and risk.


Metadata and Indexing

Make data discoverable through searchable metadata tags and a master index or catalog. This helps avoid "dark data" that sits unused simply because no one can find it.

Managing research data storage requires balancing these factors strategically. As data volumes, privacy regulations, and analytical demands grow, revisiting your storage infrastructure and policies is wise. Investing in scalable and secure storage pays dividends in driving impact from data.


5. Properly Tag and Store Data

Proper tagging and storage of research data is critical for findability, integrity, and reuse. Here are some of our best practices:

  • Implement a standardized taxonomy for tagging datasets based on factors like project, subject, data type, collection method, etc. 

  • Make tagging comprehensive yet consistent. Strike a balance between precision and practicality.

  • Require researchers to tag datasets at project inception through templates, forms, or workflows. Avoid after-the-fact tagging.

  • Incorporate taxonomy terms into file naming conventions as well for quick identification.

  • Leverage auto-tagging capabilities built into data management systems when possible.

  • Store data thoughtfully using hierarchical folder structures and naming aligned to the taxonomy. 

  • Avoid nesting folders more than 3-4 levels deep to retain findability.

  • Version datasets appropriately as they get updated or superseded. Retire old versions.

  • Contextualize data with rich README files describing methodology, analyses, caveats, etc.

  • Handle sensitive data thoughtfully per security protocols and access controls.

  • Back up data redundantly both on-premises and off-site in case of failure or loss. 

  • Document data storage locations, organization, and lifecycles clearly in SOPs. 
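
To show how such conventions can be enforced automatically, here is a small, hypothetical Python check; the taxonomy terms and the 4-level depth limit are illustrative placeholders, not prescribed values:

  from pathlib import PurePosixPath

  # Hypothetical controlled taxonomy; in practice this would live in your data catalog.
  APPROVED_TAGS = {"survey", "imaging", "genomics", "interview", "pilot", "longitudinal"}
  MAX_FOLDER_DEPTH = 4

  def check_dataset(path: str, tags: set) -> list:
      """Return a list of convention violations for one dataset."""
      problems = []
      unknown = tags - APPROVED_TAGS
      if unknown:
          problems.append(f"tags not in taxonomy: {sorted(unknown)}")
      depth = len(PurePosixPath(path).parts) - 1  # folders above the file itself
      if depth > MAX_FOLDER_DEPTH:
          problems.append(f"folder nesting of {depth} exceeds limit of {MAX_FOLDER_DEPTH}")
      return problems

  print(check_dataset("ProjectX/survey/2022/raw/ProjectX-SurveyData-2022-V1.csv",
                      {"survey", "pilot"}))  # -> []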


Tagging and storage practices have huge downstream impacts on data utility and analytics. Take time upfront to implement conventions and systems that facilitate discovery and trust. Quality tagging and storage pay continuous dividends.


Beyond tagging and storage, remember that research data often contains sensitive and confidential information. Everything from personal healthcare data to proprietary formulas and processes may pass through researchers' hands. That makes data security a huge responsibility. What steps can you take to make it a priority?


Start with a risk assessment - catalog what types of sensitive data you handle and any applicable regulations like HIPAA or GDPR in the EU. This helps quantify the potential impacts of a breach.


With risks understood, establish robust access controls. Allow only authorized personnel to access certain datasets or directories. Require strong passwords and multi-factor authentication for sensitive systems. 


When data is "at rest" in storage, leverage encryption to make it unreadable without keys if compromised. Similarly, encrypt data "in motion" during transfers.
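
As one illustration of at-rest encryption, the sketch below uses Python's third-party cryptography package (one option among many); the file names are hypothetical, and real deployments hinge on careful key management in a secrets store rather than in code:

  # Minimal at-rest encryption sketch using the 'cryptography' package (pip install cryptography).
  from cryptography.fernet import Fernet

  key = Fernet.generate_key()   # in practice, keep this in a secrets manager, never beside the data
  cipher = Fernet(key)

  with open("participant_responses.csv", "rb") as f:
      ciphertext = cipher.encrypt(f.read())

  with open("participant_responses.csv.enc", "wb") as f:
      f.write(ciphertext)

  # Later, an authorized user with access to the key can recover the plaintext.
  plaintext = cipher.decrypt(ciphertext)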


Guard against insider threats with strict permission rules and monitoring of employee access. Many breaches originate from within through intentional or careless actions. Ongoing staff security training is vital.


Have strong incident response plans ready in case a breach does occur. Timely containment, root cause analysis, and notification protocols are critical for minimizing damage.


Ultimately, cultivate a culture of security mindfulness. Researchers interacting with data daily should internalize good habits around access, sharing, and disposal.


Robust security provides confidence to collect, store, and extract value from sensitive data. But it requires foresight, investment, and vigilance given the attractiveness of research data to thieves. Is your security program comprehensive and adaptive enough to counter persistent threats? Making it a priority is key to both safeguarding data and unlocking its full potential.


6. Define Data Access Policies

Governing who can access research data, and under what conditions, is a fundamental data management responsibility. Clear data access policies balance openness and sharing with security and privacy. 

 

Start by classifying data by sensitivity levels. Is it non-confidential, confidential, or highly restricted? This provides a risk-based foundation for access rules.


For less sensitive data, focus on maximizing accessibility for collaborators to enable seamless sharing. For more restricted data, implement access controls like:

  • Role-based permissions where certain teams/individuals can access designated datasets.

  • Temporary access grants for limited periods like the duration of a project.

  • Read-only restrictions for certain users to prevent modification/deletion.

  • Data masking techniques to anonymize confidential information for wider accessibility.

  • Encryption and multi-factor authentication for highly sensitive datasets.

  • Watermarking or fingerprinting to trace data leaks if they do unfortunately occur. 
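
Here is a deliberately simplified sketch of how sensitivity-based, role-based read rules might be expressed; the role and level names are made up, and in practice enforcement belongs in the storage platform or identity provider rather than in application code:

  # Hypothetical sensitivity levels and the roles allowed to read each of them.
  READ_PERMISSIONS = {
      "non_confidential":  {"collaborator", "analyst", "steward", "pi"},
      "confidential":      {"analyst", "steward", "pi"},
      "highly_restricted": {"steward", "pi"},
  }

  def can_read(role: str, sensitivity: str) -> bool:
      """Return True if the role may read data at the given sensitivity level."""
      return role in READ_PERMISSIONS.get(sensitivity, set())

  assert can_read("collaborator", "non_confidential")
  assert not can_read("collaborator", "highly_restricted")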


It’s important to document and communicate policies to all stakeholders. Make access simple and seamless where possible through SSO and APIs. Automate enforcement via permissions built into storage systems.


Revisit policies regularly as new data sources emerge or teams shift. Granting and revoking access in a timely manner is critical.


The goal is to enable data utility through sharing while retaining control over who can do what. Well-designed data access policies are a cornerstone of maximizing research data's value and minimizing risk. How can you improve the clarity and rigor of your policies?


7. Keep Data Clean

Dirty data severely impedes analysis and decision-making. From incorrect values to inconsistencies, duplicates, and missing elements, poor data quality has massive downstream impacts. Some of our best practices to keep research data clean:

  1. Establish data validation checks during entry/collection to catch issues early. Don't allow bad data in systems.

  2. Standardize taxonomies, naming conventions, metadata tags, etc. to maintain consistency across datasets. 

  3. Deduplicate regularly. Identify repeat records and determine a master authoritative record.

  4. Profile data periodically to check for outliers, gaps, and irregularities that may indicate errors.

  5. Build workflows to scrub, transform, and normalize data from various sources into properly structured destination formats.

  6. Enrich data by merging disparate datasets and adding identifiers to connect related records across systems. 

  7. Retire outdated datasets that may create confusion and inaccuracies if used.

  8. Provide researchers with high-quality reference data like master customer lists to minimize new errors.
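
As a small example of validation and deduplication at the point of entry, assuming tabular data and the pandas library (the file and column names are invented for illustration):

  import pandas as pd

  df = pd.read_csv("incoming_survey_batch.csv")  # hypothetical input batch

  # Basic validation checks before the batch is allowed into the main store.
  problems = []
  if df["participant_id"].isna().any():
      problems.append("missing participant_id values")
  if not df["age"].between(0, 120).all():
      problems.append("age values outside plausible range")

  # Deduplicate, keeping the most recent record per participant as the master.
  df = (df.sort_values("collected_on")
          .drop_duplicates(subset="participant_id", keep="last"))

  if problems:
      raise ValueError(f"batch rejected: {problems}")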


We believe the key is being proactively vigilant. Establish data quality KPIs and metrics to continually monitor. Maintaining clean data is an ongoing initiative requiring people, processes, and enabling technology.


In the end, it leads to tremendous payoffs for analytics, reporting, operations, and decisions. And it builds trust and adoption by giving teams confidence in the validity of underlying data. Is poor data quality undermining your work? Focus first on preventing bad data from ever entering your ecosystem. Once in, act swiftly to fix or remove it through sound data cleansing practices.


8. Utilize Data Lineage

Understanding the origins and movement of data - its lineage - is crucial for troubleshooting, compliance, and trust. How can you track and leverage lineage?

  • Document data flows with diagrams mapping key systems and handoffs from raw sources to downstream usage. Keep updated.

  • Log pipeline runs and audit trails in ETL, ML, and other workflows to record data provenance under the hood.

  • Build a master data dictionary cataloging datasets, columns, origins, processing logic, quality rules, and usage.

  • Use schema comparison to identify changes between upstream sources and transformed outputs.

  • Implement data lineage tools to automatically visualize relationships, transformations, and handoffs end-to-end.

  • Tag certain regulatory or ethical data with identifiers that carry forward downstream to track its usage.  

  • Incorporate lineage metadata from systems like data catalogs into reporting for transparency.

  • Train data stewards and technologists on interpreting lineage maps to inform enhancement and troubleshooting.

  • Leverage lineage to ensure downstream datasets retain traceability back to authorized sources of record.

  • Use lineage flows to guide impact analysis when upstream changes occur. Identify affected outputs.
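
One lightweight way to log provenance "under the hood" is to append a lineage record each time a pipeline step writes an output. The hand-rolled sketch below is illustrative only; dedicated lineage tools and standards such as OpenLineage do this far more thoroughly:

  import json, hashlib
  from datetime import datetime, timezone

  def record_lineage(step, inputs, output, log_path="lineage_log.jsonl"):
      """Append one provenance record describing a pipeline step to a JSON-lines log."""
      entry = {
          "step": step,
          "inputs": inputs,
          "output": output,
          "output_sha256": hashlib.sha256(open(output, "rb").read()).hexdigest(),
          "run_at": datetime.now(timezone.utc).isoformat(),
      }
      with open(log_path, "a") as f:
          f.write(json.dumps(entry) + "\n")

  # Example: a cleaning step that turned a raw export into an analysis-ready file.
  record_lineage("deduplicate_survey", ["raw/survey_export.csv"], "clean/survey_master.csv")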


Getting control of data lineage provides tremendous benefits. But it requires systematically connecting the dots across disparate systems with documentation, logging, and monitoring. Make lineage a priority and unlock a powerful lens for managing and improving data practices.


9. Employ a Data Governance Strategy

We believe effective data management requires more than just technical capabilities - it demands the implementation of comprehensive data governance. A data governance strategy establishes roles, policies, and processes to manage data as a strategic asset. Here are some of the best practices we recommend:

  1. Appoint a chief data officer and/or data stewards for different domains as accountable business leaders overseeing data activities.

  2. Create a data governance council with cross-functional representation to set policies and standards.

  3. Develop guidelines dictating things like metadata standards, data quality thresholds, access levels, and acceptable use. 

  4. Build out processes for governing data end-to-end from acquisition to disposal, addressing collection, storage, security, lineage, etc.

  5. Implement supporting technologies like master data management, data catalogs, data lineage tools, and data quality monitoring.

  6. Establish issue escalation frameworks and change management protocols for data governance policies and processes.

  7. Train users in following established data guidelines through onboarding, workshops, certifications, etc. 

  8. Report on data governance KPIs like usage, quality metrics, policy compliance, risks, and issues for continuous improvement.


Effective data governance permeates people, processes, and technology. It provides the infrastructure for managing data strategically long-term through changing needs and growth. Take an inventory of your current data governance approach. Establish oversight roles and councils if lacking. Solidify critical guidelines and implement essential technologies for automation and analytics. View data governance as an evolving, collaborative endeavor between people and systems, not a one-off project. With robust governance, data transforms from a liability into an asset driving maximal insight and value.


10. Establish a Single Source of Truth

Having fragmented, inconsistent data spread across siloed systems severely impedes analysis and decisions. A best practice is to establish a single source of truth for critical datasets. But how do you consolidate disparate data into authoritative master versions?


First, prioritize the data most critical for core operations and analytics; for example, survey responses, IoT device data, genomic sequences, or thoracic imaging data may be top candidates. Avoid trying to centralize everything at once.


For these high-value datasets, map out all the various sources and duplicate versions across databases, files, and tools. Document how each system of record is currently used.


Determine the system of record that will be the single authoritative data source going forward. It should have adequate governance, quality, and access capabilities. Retire redundant versions. 


Implement ETL (extract, transform, load) and data integration processes to reliably funnel and reconcile data from the remaining sources into the new master dataset. Enforce uniqueness constraints.
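
A minimal sketch of that consolidation step, assuming pandas and hypothetical source files, with the uniqueness constraint expressed as a simple check on the chosen master key:

  import pandas as pd

  # Extract: pull the remaining source copies of the same participant records.
  sources = [pd.read_csv(p) for p in ["lab_a_participants.csv", "lab_b_participants.csv"]]

  # Transform: stack them, normalize the key, and keep one authoritative row per participant.
  combined = pd.concat(sources, ignore_index=True)
  combined["participant_id"] = combined["participant_id"].str.strip().str.upper()
  master = combined.drop_duplicates(subset="participant_id", keep="last")

  # Load: enforce uniqueness before writing to the system of record.
  assert master["participant_id"].is_unique
  master.to_csv("participants_master.csv", index=False)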


Establish data stewards to oversee the ongoing management of these mission-critical datasets as assets. Continually monitor their completeness, accuracy, and use.


Update all downstream reports, models, and applications to connect to the new single sources of truth rather than legacy datasets.


Master data management and data cataloging tools can provide crucial infrastructure. But procedural, access, and mindset changes are also key to adoption.


In many organizations, establishing single sources of truth may require an overhaul of decades-old practices, but it adds a great deal of post-publication integrity in meeting open science, research repeatability, and FAIR principles.


11. Keep Documentation

Thorough documentation is essential for research data management, yet it is often neglected. Undocumented data creates confusion, causes delays, and hinders reuse.

What practices produce effective documentation?

  • Document as you go rather than after the fact. Describe datasets, processes, systems, and policies in real time.

  • Store documentation alongside data, not just scattered files. README files help, but a centralized knowledge base is ideal.

  • Organize documentation for easy navigation - use wikis, tags, trees, and links rather than long paragraphs.

  • Standardize document templates and sections to maintain consistency across datasets, SOPs, systems, etc.

  • Cover key info like methodology, collection process, variables, analysis, provenance, quality checks, access instructions, and contacts.  

  • Diagram relationships between systems, workflows, and data sources for clarity.

  • Make documentation conversational and accessible to users with varying expertise. Avoid overly technical jargon.  

  • Establish documentation standards and provide training in proper techniques.

  • Assign documentation responsibilities clearly within teams and review regularly for quality.


Great documentation fuels efficient analysis, improves quality, prevents knowledge loss, enables collaboration, and much more. But it also requires habits, standards, and infrastructure. Is poor or scattered documentation hindering your team? Dedicate time to creating it alongside your processes and systems as you go, not after. View it as an integral part of the research lifecycle, not an afterthought. Your future self and colleagues will thank you.


12. Leverage Automation Technologies

Managing research data manually can be extremely cumbersome and error-prone. Leveraging automation technologies is key to boosting efficiency, quality, and analytical capability. Some technologies to consider:

  • Metadata capture tools to automatically tag and catalog data assets rather than relying on manual entry.

  • Data quality tools that profile, cleanse, deduplicate, and enrich data.

  • Data catalogs that centrally index available datasets with search functionality. 

  • Workflow schedulers such as cron, Windows Task Scheduler, and Airflow that trigger chains of processes for routine data movement (see the sketch after this list).

  • Reporting dashboards that monitor KPIs and metrics related to data quality, compliance, usage, etc. 

  • Data mapping tools that create visual lineage maps between data sources and outputs.
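
Picking up the workflow-scheduler item above, here is a bare-bones sketch of a daily ingestion workflow in recent versions of Apache Airflow; the DAG name and task bodies are placeholders, not a recommended pipeline:

  from datetime import datetime
  from airflow import DAG
  from airflow.operators.python import PythonOperator

  def ingest_new_files():
      # Placeholder: copy newly arrived instrument exports into the landing zone.
      ...

  def validate_and_catalog():
      # Placeholder: run validation checks and register metadata in the catalog.
      ...

  with DAG(
      dag_id="daily_research_data_ingest",
      start_date=datetime(2024, 1, 1),
      schedule="@daily",
      catchup=False,
  ) as dag:
      ingest = PythonOperator(task_id="ingest_new_files", python_callable=ingest_new_files)
      catalog = PythonOperator(task_id="validate_and_catalog", python_callable=validate_and_catalog)
      ingest >> catalog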


The goal is to remove manual bottlenecks and inconsistencies. Technologies provide scalable, rules-based automation shifting focus from tedious tasks to higher-value analysis and oversight.


But tools alone are not the answer. Also, assess processes and team skills. Your workflows should optimize human-machine handoffs. And remember, people need training to use technologies effectively.

 

Target high-volume, error-prone tasks for earliest automation. Even partial automation brings gains. Over time, build a flexible toolkit of technologies tailored to your data challenges and growth. An integrated fabric of automation underpins responsive, trustworthy data capabilities as research data volumes and complexity surge.


13. Monitor User Activity

With sensitive research data, monitoring user activity is crucial for security, compliance, and data integrity. Some best practices we encourage:

  • Enable comprehensive logging across systems storing or accessing data to provide audit trails of user actions.

  • For databases and warehouses, monitor queries, data access, failed logins, and other activities in addition to general user tracking. 

  • Use tools like security information and event management (SIEM) to aggregate and analyze logs for threat detection and incident investigation.

  • Employ user behavior analytics to establish normal baselines and alert anomalous activity indicative of insider risks.

  • Mask or obscure sensitive information like health records within logs, but retain enough metadata for monitoring.

  • Establish dedicated reports, alert thresholds, and monitoring workflows focused on critical data and high-risk access points.

  • Train security personnel and data stewards on interpreting monitoring data to identify potential abuse or misuse for further action.

  • Periodically review logs and audit trails for compliance needs and optimize auditing configurations.

  • Remind users that monitoring is in place and that action will be taken for unauthorized or improper data access.
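
As a toy illustration of the kind of structured audit event these logs should capture (a real deployment would lean on the database's and SIEM's native auditing rather than application code, and the field names here are invented):

  import json, logging
  from datetime import datetime, timezone

  audit = logging.getLogger("data_access_audit")
  audit.setLevel(logging.INFO)
  audit.addHandler(logging.FileHandler("data_access_audit.log"))

  def log_access(user, dataset, action, allowed):
      """Write one structured audit record per data-access attempt."""
      audit.info(json.dumps({
          "timestamp": datetime.now(timezone.utc).isoformat(),
          "user": user,
          "dataset": dataset,   # log identifiers, not the sensitive contents themselves
          "action": action,     # e.g. "read", "export", "delete"
          "allowed": allowed,
      }))

  log_access("jsmith", "trial_results_2024", "export", allowed=False)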


Monitoring is essential to information security programs. For research data, it also supports ethics and compliance. Make monitoring comprehensive across all systems yet focused on the highest-risk areas. And ensure that you have clear response protocols when irregularities arise. With robust activity monitoring and vigilance, potential data risks transform into managed insights.


14. Foster Collaboration

Research data's value multiplies when it can be combined, shared, and analyzed collaboratively. So, how can you foster greater openness and collaboration?

  1. Have clear data sharing policies - specify what can be shared openly, what requires review/approval, and terms of use. 

  2. Build self-service access processes like data portals for discoverability and dissemination. Automate approvals where possible.

  3. Anonymize confidential data to enable broader sharing without exposing sensitive information.

  4. Participate in consortiums and collaborative initiatives to harmonize data collection, analysis, and sharing across organizations.

  5. Structure datasets for interoperability using standardized formats and comprehensive metadata.  

  6. Provide analysis tools and interfaces that allow external users to work with data - don't just deliver raw data.

  7. Develop clear co-authorship and attribution policies for collaborators to incentivize sharing.

  8. Embrace preprints and open-access publishing to rapidly disseminate findings and underlying data.

  9. Promote a culture that values collaboration via training on responsible data sharing, collaborative project structures, etc.

  10. Require collaborators handling sensitive data to sign agreements around ethics, security, and responsible usage.  

  11. Build in protections like access restrictions, watermarking, and encryption to share data safely.


We believe fostering collaboration while protecting data requires forethought, and when done right, it accelerates discovery and delivers research with broader impact. Break down barriers to working jointly across teams and organizations. Yes, tools and policy matter, but so does culture. By nurturing collaboration, your data's dividends will multiply.


15. Choose User-Friendly Tools

The success of any data management system hinges on adoption by users. Choosing tools prioritizing usability encourages utilization and reduces friction. Here are some of our suggested characteristics of user-friendly data tools:

  • Intuitive graphical interfaces requiring minimal training. Avoid overly complex UIs.

  • Natural language search that makes it easy to discover and access relevant data.

  • Smart recommendations based on context and user history to surface relevant content.

  • Interoperability with existing systems and workflows for seamless integration. 

  • Flexible self-service access models giving users autonomy without bottlenecks.

  • Collaborative features like built-in comments, notifications, and sharing capability.

  • Options for both GUI and API access to accommodate different users.  

  • Responsive performance with near real-time previews and fast query results.

  • Mobile accessibility for use across devices.

  • Custom reporting and dashboarding capabilities without coding.


Prioritizing the user experience removes adoption barriers. But balancing ease of use with security and functionality is key. Leverage governance rules and role-based access behind the scenes to maintain control.


User experience goes beyond the interface. Also design supportive onboarding, training, and ongoing enhancement driven by user feedback. Usability, like security, should be woven into tools from the outset. When systems become frictionless and intuitive for researchers, data's true analytical potential is unleashed.


16. Stay Up to Date on Trends

Research data management is a dynamic field requiring proactive learning. Emerging technologies, best practices, regulations, and data sources should be monitored. Here are some of our helpful tips:

  1. Attend conferences like CANARIE, RRCoP, or Internet2, and review industry publications covering new data tools and strategies. Don't rely solely on internal perspectives.

  2. Follow thought leaders in the data space via blogs, social media, and newsletters to stay apprised of the latest trends and thinking.

  3. Join professional associations such as the Canadian Research Knowledge Network or Arenet and data management groups to plug into a knowledge-sharing community.

  4. Regularly re-evaluate internal practices against external innovation and evolving standards to identify gaps. 

  5. Pilot new technologies and methodologies selectively to gain hands-on experience rather than immediately committing.

  6. Interview peers at other top research organizations to compare approaches and outlooks. 

  7. Maintain an inquisitive, outward-looking mindset when encountering data challenges. Research if others have addressed it.

  8. Designate someone responsible for monitoring the external data landscape to synthesize and share relevant insights internally.

  9. Foster a culture of continuous learning around data, not just at implementation but on an ongoing basis, to optimize practices.


With technology and best practices constantly advancing, data management requires an agile, ever-learning approach. Make staying current a priority through dedicated time, resources, and knowledge sharing. If not, your practices risk stagnation amidst quickly changing external innovation and demands.


17. Start Gaining an Edge on the Competition

Research is increasingly competitive. How you manage data can become a strategic advantage. Some ways to gain an edge:

  1. Implement self-service analytics tools empowering researchers to gain insights faster from data.

  2. Automate repetitive data prep and processing tasks so teams can focus on higher-value analysis. 

  3. Provide secure collaboration environments to seamlessly share sensitive data with partners.

  4. Build custom dashboards delivering real-time visibility into KPIs like project progress and data quality.

  5. Leverage capabilities like AI and machine learning to uncover hidden insights within complex data.

  6. Hire dedicated data stewards and analysts to support teams with day-to-day data needs.

  7. Catalog datasets thoroughly with rich metadata so researchers can easily find the most relevant data.

  8. Structure unstructured data through ETLs and master data management for superior analysis. 

  9. Establish strong data governance and quality standards giving teams confidence in data accuracy.

  10. Migrate siloed legacy systems into modern scalable architecture for agility and performance.

  11. Implement robust data security so researchers can collect, store, and analyze sensitive data safely. 

  12. Develop APIs and standards for seamless data exchange with collaborators and repositories. 


Managing research data strategically, not just operationally, is crucial for competitive advantage. Elevating your data maturity unlocks new opportunities and accelerates discoveries before peers. But focus on high-impact changes delivering tangible value to researchers. With even marginal improvements, data becomes an asset for breakthroughs rather than a liability.


18. Continually Review Processes

Research data management requires regularly reviewing and optimizing processes as needs evolve. Some helpful tips we suggest are:

  1. Establish routine meetings to critically assess what's working well and what's not across data collection, storage, security, analytics, etc.

  2. Incorporate user feedback into reviews to surface pain points and opportunities from those on the front lines. 

  3. Benchmark processes against best practices, standards, and peers to identify potential gaps.

  4. Monitor process metrics like cycle times, errors, cost per transaction, and investigate root causes of inefficiencies.

  5. Map out processes end-to-end, diagramming connections and handoffs between systems and teams. Look for redundant steps.

  6. Question long-standing legacy practices that may now be outdated or unnecessary with technological improvements.

  7. Pilot and alpha/beta test process changes before full-scale implementation to validate improvements.

  8. Phase optimizations over time, as not everything needs to be revamped at once. Focus on quick wins first.

  9. Communicate process changes through training and new standard operating procedures (SOP) documentation to support adoption.

  10. Automate manual processes wherever possible for consistency and time savings.


Research data management must balance agility with thoughtfulness. Be open to rethinking entrenched practices, but don't change solely for change’s sake. Ground process reviews in tangible metrics and researcher needs. Iteratively fine-tune over time. With regular, rigorous assessment, you will continuously adapt and improve the processes underpinning your data capabilities.



19. The Power of Blockchain in Research Data Management

Blockchain technology offers intriguing possibilities for research data management. For example, can blockchain secure sensitive research data in a university setting? We think it can. At myLaminin, we've combined the power of blockchain with the ability to share research data securely, conveniently, and with control. We've done this by:

  • Enhancing provenance tracking and audit trails by recording data transactions and workflow events immutably on a blockchain.

  • Sharing datasets securely with collaborators by granting access via blockchain keys rather than usernames/passwords.

  • Preserving integrity for critical datasets using blockchain to create immutable records that detect tampering.

  • Managing consent and sharing permissions via smart contracts that execute data access agreements on a blockchain.

  • Developing research data marketplaces and exchanges to share datasets and algorithms using blockchain for payments, contracts, and digital rights.

  • Archiving datasets like trial data on blockchain to retain permanent tamper-proof records with timestamps.
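
To give a concrete flavor of the integrity idea above - as a simplified stand-in, not a description of myLaminin's implementation - a cryptographic fingerprint of a dataset can be recorded at archiving time and recomputed later to detect tampering:

  import hashlib

  def dataset_fingerprint(path: str) -> str:
      """Compute a SHA-256 digest of a file; any later change to the file changes the digest."""
      h = hashlib.sha256()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(1 << 20), b""):
              h.update(chunk)
      return h.hexdigest()

  # Record this digest immutably when the dataset is archived...
  published_digest = dataset_fingerprint("trial_data_v1.csv")

  # ...and re-check it whenever integrity needs to be proven.
  assert dataset_fingerprint("trial_data_v1.csv") == published_digest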

Conclusion

Managing research data effectively is a multifaceted endeavor requiring strategy and diligence. But taking even small steps to improve practices pays dividends. 


Start by identifying your biggest pain points and bottlenecks. Is it data discovery? Collaboration? Monitoring and security? Focus energy on pragmatic solutions to these high-impact areas first.


Make incremental enhancements, measuring impact along the way. There is no one-size-fits-all. Tailor your data management evolution to research needs and constraints. 


While long-term aspirations are important, don't let the perfect be the enemy of the good. Optimize what you can today, independent of larger modernization plans.

And remember - technology is only one piece of the puzzle. Improving skills and mindsets through training and communication ensures adoption. 


And, no research organization has all aspects of data management figured out. But by learning from peers, evaluating new tools, challenging assumptions, and staying agile, we’re confident you will continuously strengthen the foundations for impactful discoveries.


The recommendations we've provided here aim to help you assess current practices and uncover opportunities. But you and your teams are best positioned to chart an effective data management course befitting your research realities. We believe that with a thoughtful roadmap and commitment to continual improvement, your data will become an increasingly valuable asset propelling insights and advancement.

__________________________________


Deanne Bassili (article author) is the COO of myLaminin, a secure research data management platform for academia using blockchain and web3 technologies to secure sensitive research data and reduce operational inefficiencies for principal investigators, their team members, external collaborators, research legal services, research ethics boards, research librarians, IT, and administrators.
