• Data Engineering as a Scientific Tool
    Jan 11 2023

    In this episode, host Peter Wang is joined by Dr. Patrick Kavanagh, an astrophysicist and software developer at the Dublin Institute for Advanced Studies. Patrick works on the James Webb Space Telescope (JWST), helping to write code that allows scientists to interpret the raw data they receive from space.

    Patrick talks to Peter about cleaning telescope data sets to make them more scientifically useful, and more. Patrick's team working on the Mid-Infrared Instrument on the JWST writes software in Python to help deliver science-ready data to astronomers and astrophysicists. Patrick's work facilitates more precise study of distant stars and galaxies in a way that fosters public trust.

    Peter Wang - https://www.linkedin.com/in/pzwang/

    Dublin Institute for Advanced Studies - https://www.linkedin.com/school/dublin-institute-for-advanced-studies/

    James Webb Space Telescope - https://webb.nasa.gov/

    Check out these relevant resources:

    • Dr. Patrick Kavanagh - EuroPython
    • Python and James Webb
    • Judy Schmidt (citizen scientist)

    If you enjoyed today's show, please leave a 5-star review. For more information, visit anaconda.com/podcast.

    #Computing #AI #Data #DataScience #Analytics

    1 hr 6 min
  • Optimizing Python for Speed and Compatibility
    Dec 28 2022

    In the penultimate episode of season one, host Peter Wang and Carl Meyer, Software Engineer at Instagram (owned by Meta), discuss considerations around making Python faster while maximizing compatibility and performance.

    Several years ago, Carl and his team started working on a project called Cinder in an effort to improve CPU efficiency across Meta's servers by "[optimizing] things at the level of Python runtime." While initially meant to serve as a stopgap, Cinder yielded impressive wins that transformed it into a premier and ongoing project at Instagram.

    In addition to Cinder, Peter and Carl discuss:

    - Carl's experiences with various programming languages like TI-Basic, Perl, and PHP

    - Challenges around innovating on an established language with 30+ years of history

    - The potential evolution of Python use cases and best practices

    - And more!

    Peter Wang - https://www.linkedin.com/in/pzwang/

    Carl Meyer - https://www.linkedin.com/in/carljm/

    Instagram - https://www.linkedin.com/company/instagram/

    Cinder - https://github.com/facebookincubator/cinder

    If you enjoyed today's show, please leave a 5-star review. For more information, visit https://www.anaconda.com/podcast.

    #Computing #AI #Data #DataScience #Analytics

    49 min
  • Climate Science, Scientific Computing, and Data Accessibility
    Dec 14 2022

    This episode's conversation between host Peter Wang and Ryan Abernathey, Associate Professor at Columbia University in the City of New York, explores climate science, scientific computing, data accessibility, and more.

    Topics that Peter and Ryan cover include:

    - Cloud computing

    - Open data and collaboration

    - Climate science and the private sector

    - Open-source projects like Pangeo Forge and Xarray

    Climate data is sometimes restricted in the way it flows between interested parties; the growth of private industry around data storage and dissemination has put up barriers to entry that can limit access to valuable systems and data. This is especially troubling to Ryan because these barriers often exclude some of the people who are most affected by climate change. He feels that usable information can and should be made accessible without undermining private interests.

    Peter Wang - https://www.linkedin.com/in/pzwang/

    Ryan Abernathey - https://www.linkedin.com/in/ryan-abernathey-32a70652/

    Columbia University in the City of New York - https://www.linkedin.com/school/columbia-university/

    Pangeo Forge - https://pangeo-forge.org/

    Xarray - https://docs.xarray.dev/en/stable/

    You can find a human-verified transcript of this episode here. - https://know.anaconda.com/rs/387-XNW-688/images/ANACON_%20Ryan%20Abernathey_HVT.docx.pdf

    If you enjoyed today's show, please leave a 5-star review. For more information, visit https://www.anaconda.com/podcast.

    56 min
  • Shaping Best Practices for Monitoring ML Models
    Nov 30 2022

    In this episode, host Peter Wang is joined by Elena Samuylova, CEO and Co-Founder of Evidently AI. Peter and Elena discuss how Evidently AI's open-source tooling is helping users monitor machine learning (ML) models, and why that's important.

    Elena has found that Evidently AI's open-source approach is attractive to data scientists and ML engineers who are ramping up model maintenance, retraining, and monitoring efforts.

    Peter and Elena also touch on:

    - On-premises versus cloud-based deployment

    - ML model monitoring best practices

    - The value of pipeline testing

    - And more!

    You can find a human-verified transcript of this episode here. - https://know.anaconda.com/rs/387-XNW-688/images/ANACON_%20Elena%20Samuylova_%20HVT.docx.pdf

    If you enjoyed today's show, please leave a 5-star review. For more information, visit anaconda.com/podcast.

    #ML #AI #Data #DataScience #Analytics

    35 min
  • Unifying and Accelerating Data Science, ML, and Advanced Analytics Workflows
    Nov 16 2022

    In this episode, host Peter Wang speaks with Torsten Grabs, Director of Product Management at Snowflake, about how Snowflake solutions support professionals in data science, machine learning, and advanced analytics.

    Torsten has worked with data throughout his entire career. At Snowflake, he focuses on Snowflake's data lake, data pipelines, and data science workloads, as well as Snowflake's developer and partner ecosystem.

    Thanks to the broader language compatibilities of Snowflake and its Snowpark library, data engineering is becoming more accessible beyond the SQL community. Torsten and Snowflake continue to work to unify and accelerate data workflows.

    Peter Wang - https://www.linkedin.com/in/pzwang/

    Torsten Grabs - https://www.linkedin.com/in/torstengrabs/

    Snowflake - https://www.linkedin.com/company/snowflake-computing/

    Learn more about Snowpark for Python:

    Snowpark for Python - https://www.snowflake.com/snowpark/

    General availability announcement - https://www.snowflake.com/news/snowflake-disrupts-application-development-with-general-availability-of-snowpark-for-python-native-streamlit-support-and-more/

    Snowpark Developer Guide for Python - https://docs.snowflake.com/en/developer-guide/snowpark/python/index.html

    Snowflake-Anaconda partnership - https://www.snowflake.com/blog/snowflake-partners-with-and-invests-in-anaconda-to-bring-enterprise-grade-open-source-python-innovation-to-the-data-cloud/

    How Snowflake customers like Allegis Group are leveraging Snowpark for Python - https://www.snowflake.com/blog/snowflake-partners-with-and-invests-in-anaconda-to-bring-enterprise-grade-open-source-python-innovation-to-the-data-cloud/

    Access Anaconda's State of Data Science report, referenced by Peter, here. - https://www.anaconda.com/state-of-data-science-report-2022

    You can find a human-verified transcript of this episode here. - https://know.anaconda.com/rs/387-XNW-688/images/ANACON_Paco_Nathan_V1.docx.pdf

    If you enjoyed today's show, please leave a 5-star review. For more information, visit anaconda.com/podcast.

    39 min
  • Autopoiesis in Systems of People and Machines
    Nov 2 2022

    In "Autopoiesis in Systems of People and Machines," Peter Wang welcomes Paco Nathan. Paco is a Managing Partner at Derwen, Inc., a company that offers enterprise customers full-stack engineering for AI applications at scale, with an emphasis on open-source integrations. Paco forged a career in artificial intelligence when many people were skeptical of it and now boasts over 40 years of computer science experience.

    Peter and Paco discuss histories and frameworks that are impacting today's systems of people and machines. Paco touches on corporate law and how, long ago, the concept of insurance allowed for the externalization of risk and the corresponding enablement of capital ventures. Paco goes on to talk about autopoiesis, the Chilean Project Cybersyn and the significance of groupware, and the core of human intelligence. Peter and Paco also discuss the cybernetic future and the increasing complexity of today's world, in which less and less is linear and improved cognition is required for survival.

    Resources:

    "A Brief History Of Reinsurance" (David M. Holland) - https://www.soa.org/globalassets/assets/library/newsletters/reinsurance-section-news/2009/february/rsn-2009-iss65-holland.pdf

    Santa Clara County v. Southern Pacific Railroad Co., 118 U.S. 394 (1886) - https://supreme.justia.com/cases/federal/us/118/394/

    "Law as an Autopoietic System" (Gunther Teubner) - https://cadmus.eui.eu/handle/1814/23894

    Autopoiesis and Cognition: The Realization of the Living (Humberto Maturana and Francisco Varela) - https://en.wikipedia.org/wiki/Autopoiesis_and_Cognition:_The_Realization_of_the_Living

    Project Cybersyn - https://99percentinvisible.org/episode/project-cybersyn/

    "Understanding Computers and Cognition" (Terry Winograd and Fernando Flores) - https://philpapers.org/rec/WINUCA

    Macy Conferences - https://en.wikipedia.org/wiki/Macy_conferences

    Norbert Wiener - https://en.wikipedia.org/wiki/Norbert_Wiener

    "What the Frog's Eye Tells the Frog's Brain" (J.Y. Lettvin et al.) - https://hearingbrain.org/docs/letvin_ieee_1959.pdf

    Social Systems - https://www.sup.org/books/title/?id=2225

    Niklas Luhmann - https://en.wikipedia.org/wiki/Niklas_Luhmann

    Dubberly Design (Paul Pangaro) (When Paco references Donoho Design, he means Dubberly Design.) - http://www.dubberly.com/articles/cybernetics-and-design.html

    René Thom - https://en.wikipedia.org/wiki/Ren%C3%A9_Thom

    "Corporate Metabolism" (Paco Nathan) - https://www.tripzine.com/listing.php?id=corporate_metabolism

    You can find a human-verified transcript of this episode here - https://know.anaconda.com/rs/387-XNW-688/images/ANACON_Paco_Nathan_V1.docx.pdf

    If you enjoyed today's show, please leave a 5-star review. For more information, visit anaconda.com/podcast.

    56 min
  • From "Enthusiastic User" to pandas Maintainer
    Oct 19 2022

    On this episode of Numerically Speaking: The Anaconda Podcast, host Peter Wang welcomes pandas maintainer Jeff Reback, Managing Director at Two Sigma.

    Jeff began his career on Wall Street in the 1990s and used Perl for a long time. He developed an interest in Python in the 2000s, was quickly drawn to pandas, and began to spend his hour-long ferry commutes contributing to its open-source code. His contributions over the years have been significant, to say the least.

    When it comes to open source, says Peter, "my flame isn't diminished by lighting your candle." Cloning a copy of pandas, for example, does not make the original copy any less valuable. In fact, source code actually increases in value as it circulates.

    Until recently, only volunteers worked on pandas—but as of 2022, three full-time maintainers are paid to contribute, review code, and triage issues.

    Jeff's advice for anybody interested in contributing to open source? Find a community and just help out.

    Click https://www.youtube.com/watch?v=7JHqxODJG9k to check out "Two Sigma Presents Pandas at a Crossroads the Past Present and Future with Jeff Reback" on YouTube.

    You can find a human-verified transcript of this episode here - https://know.anaconda.com/rs/387-XNW-688/images/ANACON_Jeff%20Reback_V1.docx.pdf


    Resources:


    Peter Wang LinkedIn - https://www.linkedin.com/in/pzwang/


    Jeff Reback LinkedIn - https://www.linkedin.com/in/jeff-reback-3a20876/


    Two Sigma LinkedIn - https://www.linkedin.com/company/two-sigma-investments/

    If you enjoyed today's show, please leave a 5-star review. For more information, visit https://anaconda.com/podcast.

    39 min
  • A Specialized Approach to Hardware
    Oct 5 2022

    End users who are not schooled in hardware often default to "just give me something that works." David Liu, Staff AI Engineer, Strategy & Vision for Data Science and AI Products at Intel, understands this thinking but also believes that end users can be educated on the advantages of configuring their computer hardware to suit their specific needs.

    David advocates for using the right hardware for a given task—and that may mean different configurations and/or different machines for different tasks, rather than a one-size-fits-all solution.

    David and host Peter Wang also discuss:

    - The need for more education and resources around hardware performance

    - Intel's Optane technology and the possibilities it creates

    Resources:

    Peter Wang LinkedIn - https://www.linkedin.com/in/pzwang/

    David Liu LinkedIn - https://www.linkedin.com/in/david-liu-71004723/

    Intel LinkedIn - https://www.linkedin.com/company/intel-corporation/

    Click https://www.youtube.com/triskadecaepyon to visit David's YouTube channel.

    You can find a human-verified transcript of this episode here - https://know.anaconda.com/rs/387-XNW-688/images/ANACON_David%20Liu_V1%20%281%29.docx.pdf

    If you enjoyed today's show, please leave a 5-star review. For more information, visit https://www.anaconda.com/podcast.

    45 min