---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:647
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-distilroberta-v1
widget:
  - source_sentence: Google Sheets expertise, data validation, report restructuring
    sentences:
      - >-
        Requirements: We're looking for a candidate with exceptional proficiency
        in Google Sheets. This expertise should include manipulating, analyzing,
        and managing data within Google Sheets. The candidate should be
        outstanding at extracting business logic from existing reports and
        implementing it into new ones. Although a basic understanding of SQL for
        tasks related to data validation and metrics calculations is beneficial,
        the primary skill we are seeking is proficiency in Google Sheets. This
        role will involve working across various cross-functional teams, so
        strong communication skills are essential. The position requires a
        meticulous eye for detail, a commitment to delivering high-quality
        results, and above all, exceptional competency in Google Sheets.


        Google Sheets knowledge is preferred. Strong Excel experience
        without Google Sheets will be considered. Data validation and
        formulas to extract data are a must. Basic SQL knowledge is
        required. Strong communication skills are required. Interview
        process: 2 or 3 rounds, including an Excel (Google Sheets) skills
        assessment.
      - >-
        Requirements


        We are seeking 3+ years of related experience and a bachelor's or
        advanced degree in STEM from an accredited institution. An active,
        in-scope DoD TS/SCI security clearance is required. Ability to
        conduct analysis and import / ingest test data sets into the ArcGIS
        platform. Support testing events and ensure the data is collected
        and brought back for ingestion. Must possess the ability to work
        independently with minimal oversight while maintaining focus on
        research objectives defined by the client.


        What We Can Offer You

         We’ve been named a Best Place to Work by the Washington Post. Our employees value the flexibility at CACI that allows them to balance quality work and their personal lives. We offer competitive benefits and learning and development opportunities. We are mission-oriented and ever vigilant in aligning our solutions with the nation’s highest priorities. For over 55 years, the principles of CACI’s unique, character-based culture have been the driving force behind our success.

        Company Overview


        CACI is an Equal Opportunity/Affirmative Action Employer. All qualified
        applicants will receive consideration for employment without regard to
        race, color, religion, sex, sexual orientation, gender identity,
        national origin, disability, status as a protected veteran, or any other
        protected characteristic.


        Pay Range: There are a host of factors that can influence final salary
        including, but not limited to, geographic location, Federal Government
        contract labor categories and contract wage rates, relevant prior work
        experience, specific skills and competencies, education, and
        certifications. Our employees value the flexibility at CACI that allows
        them to balance quality work and their personal lives. We offer
        competitive compensation, benefits and learning and development
        opportunities. Our broad and competitive mix of benefits options is
        designed to support and protect employees and their families. At CACI,
        you will receive comprehensive benefits such as healthcare, wellness,
        financial, retirement, family support, continuing education, and time
        off benefits. Learn more here


        The Proposed Salary Range For This Position Is


        $74,600-$156,700
      - >-
        requirements and develop solutions that meet those needs. Stay
        up-to-date with emerging trends and technologies in robotics,
        machine learning, and UAS technology.


        Due to the nature of the work, the selected applicant must be able to
        work onsite.


        Qualifications We Require

        Bachelor's degree in Computer Engineering, Computer Science,
        Electrical Engineering, Software Engineering, Mechanical
        Engineering, Optical Science, Robotics, or related STEM field. A
        higher-level degree (MS, PhD) in a relevant field may also be
        considered in lieu of a bachelor's degree. Equivalent experience in
        lieu of a degree must be directly related experience that
        demonstrates the knowledge, skills, and ability to perform the
        duties of the job. Ability to obtain and maintain a DOE Q-level
        security clearance.

        Qualifications We Desire

        Strong knowledge of computer vision, deep learning, and other
        machine learning techniques. Strong written communication skills
        (e.g., published research in technical journals). Desire to work on
        solutions to National Security problems, especially in
        counter-autonomy and physical security system applications. Ability
        to work in a fast-paced environment with multiple priorities and
        tight deadlines. Demonstrated ability to perform machine learning
        related activities such as pipeline development, model
        explainability, and uncertainty quantification. Strong teamwork and
        leadership skills. Ability to travel domestically and
        internationally as needed (less than 15% of the time). Experience
        in the following: Python, ROS, and other scripting and scientific
        computing languages (R, C++, Java, C#); simulation software such as
        Gazebo; simulation engines such as Unreal or Unity; 3D modeling
        software; Linux/Unix operating systems; FPGAs. Familiarity with
        embedded systems and microcontrollers. Multi-sensor data fusion and
        coordination. Active DOE Q-level or DOD equivalent security
        clearance.

        About Our Team


        The Mission of department 6534 is to counter evolving autonomous threats
        to key national facilities and to improve the performance of physical
        security systems protecting those sites. We are part of a larger group
        focused on Autonomy and Unmanned Systems. We address real-world problems
        through research, development, testing, and evaluation of components and
        systems to advance the science of physical security. This enables
        customers to mitigate threats to these facilities by improving the
        ability to sense, assess, track, and respond to physical incursions. Our
        work addresses current physical security operational challenges and
        evolving threats such as unmanned aircraft systems (UAS). We specialize
        in the testing and evaluation of Counter-UAS (C-UAS) systems, which
        counter the danger posed by UAS, and we are the C-UAS test agent for
        DOE, NNSA, and DHS.


        Posting Duration


        This posting will be open for application submissions for a minimum of
        seven (7) calendar days, including the ‘posting date’. Sandia reserves
        the right to extend the posting date at any time.


        Security Clearance


        Sandia is required by DOE to conduct a pre-employment drug test and
        background review that includes checks of personal references, credit,
        law enforcement records, and employment/education verifications.
        Applicants for employment need to be able to obtain and maintain a DOE
        Q-level security clearance, which requires U.S. citizenship. If you hold
        more than one citizenship (i.e., of the U.S. and another country), your
        ability to obtain a security clearance may be impacted.


        Applicants offered employment with Sandia are subject to a federal
        background investigation to meet the requirements for access to
        classified information or matter if the duties of the position require a
        DOE security clearance. Substance abuse or illegal drug use,
        falsification of information, criminal activity, serious misconduct or
        other indicators of untrustworthiness can cause a clearance to be denied
        or terminated by DOE, resulting in the inability to perform the duties
        assigned and subsequent termination of employment.




        All qualified applicants will receive consideration for employment
        without regard to race, color, religion, sex, sexual orientation, gender
        identity, national origin, age, disability, or veteran status and any
        other protected class under state or federal law.


        NNSA Requirements For MedPEDs


        If you have a Medical Portable Electronic Device (MedPED), such as
        a pacemaker, defibrillator, drug-releasing pump, hearing aids, or
        diagnostic equipment and other equipment for measuring, monitoring,
        and recording body functions such as heartbeat and brain waves, and
        you are employed by Sandia National Laboratories, you may be
        required to comply with NNSA security requirements for MedPEDs.


        If you have a MedPED and you are selected for an on-site interview at
        Sandia National Laboratories, there may be additional steps necessary to
        ensure compliance with NNSA security requirements prior to the interview
        date.


        Job ID: 693235
  - source_sentence: Data analysis, operations reporting, SQL expertise
    sentences:
      - >-
        experience in data engineering, software engineering, data
        analytics, or machine learning. Strong expertise working with one
        or more cloud data platforms (Snowflake, Sagemaker, Databricks,
        etc.). Experience managing Snowflake infrastructure with Terraform.
        Experience building batch, near real-time, and real-time data
        integrations with multiple sources including event streams, APIs,
        relational databases, NoSQL databases, graph databases, document
        stores, and cloud object stores. Strong ability to debug, write,
        and optimize SQL queries in dbt; experience with dbt is a must.
        Strong programming experience in one or more modern programming
        languages (Python, Clojure, Scala, Java, etc.). Experience working
        with both structured and semi-structured data. Experience with the
        full software development lifecycle, including requirements
        gathering, design, implementation, testing, deployment, and
        iteration. Strong understanding of CI/CD principles. Strong ability
        to document, diagram, and deliver detailed presentations on
        solutions.

        Preferred Experience: Expertise managing and integrating with cloud
        data streaming platforms (Kinesis Data Streams, Kafka, AWS SNS/SQS,
        Azure Event Hubs, StreamSets, NiFi, Databricks, etc.). Expertise in
        working with cloud data integration platforms (Airflow / AWS MWAA,
        Snowflake Snowpipe, Kinesis Data Firehose, AWS Glue / Glue schema
        registry, Azure Data Factory, AWS DMS, Fivetran, Databricks, Dell
        Boomi, etc.). Experience building data infrastructure in a cloud
        environment using one or more infrastructure-as-code tools
        (Terraform, AWS CloudFormation, Ansible, etc.). Production
        experience with one or more cloud machine learning platforms (AWS
        Sagemaker, Databricks ML, Dataiku, etc.). Understanding of machine
        learning libraries (MLlib, scikit-learn, NumPy, pandas, etc.).
        Experience managing data governance and security enablement
        (role-based access, authentication, network isolation, data
        quality, data transparency, etc.) on a cloud data warehouse,
        especially Snowflake. Experience building and optimizing data
        models with tools like dbt and Spark. Experience integrating with
        data visualization tools (Sisense, Tableau, Power BI, Looker,
        etc.).

        Our data engineering and analytics stack includes Snowflake, dbt,
        Fivetran, Airflow, AWS, Sagemaker, and Python programming for
        custom data engineering. We use Sisense and Sigma for BI
        capability; experience with these or similar tools is preferred.
        The data team owns the provisioning and administration of all the
        tools we work with.

        BENEFITS: comprehensive and affordable insurance benefits,
        unlimited paid time off policy, 401(k) enrollment, 9 paid company
        holidays, and paid parental leave.

        Employment at Splash is based on individual merit. Opportunities are
        open to all, without regard to race, color, religion, sex, creed, age,
        handicap, national origin, ancestry, military status, veteran status,
        medical condition, marital status, sexual orientation, affectional
        preference, or other irrelevant factors. Splash is
      - >-
        experiences Spectrum is known for.


        BE PART OF THE CONNECTION


        As a Data Scientist in the Credit Services department, you’ll work in a
        fast-paced, collaborative environment to develop data-driven solutions
        to Charter’s business problems. You’ll be empowered to think of new
        approaches, use analytical, statistical and programming skills to
        analyze and interpret data sets, and learn new skills while growing your
        career with Spectrum.


        What Our Data Scientists Enjoy Most


        Leveraging knowledge in analytical and statistical algorithms to
        assist stakeholders in improving their business. Partnering on the
        design and implementation of statistical data quality procedures
        for existing and new data sources. Communicating complex data
        science solutions, concepts, and analyses to team members and
        business leaders. Presenting data insights & recommendations to key
        stakeholders. Establishing links across existing data sources and
        finding new, interesting data correlations. Ensuring testing and
        validation are components of all analytics solutions.


        You’ll work in a dynamic office environment. You’ll excel in this role
        if you are a self-starter who can work independently as well as in a
        team. If you’re comfortable presenting data and findings in front of
        team members & stakeholders and have excellent problem-solving skills,
        this could be the role for you.


        Required Qualifications


        WHAT YOU’LL BRING TO SPECTRUM


        Experience: data analytics experience: 3 years; programming
        experience: 2 years. Education: bachelor’s degree in computer
        science, statistics, or operations research, or an equivalent
        combination of education and experience. Technical skills: Python,
        R, comprehensive SQL skills, Spark, Hive. Skills: experience with
        analytics and modeling on large datasets encompassing millions of
        records; experience with the full model development and
        implementation cycle, from ideation and research through training
        and testing to model implementation. Abilities: perform in-depth &
        independent research and analysis; experience using a data science
        toolkit such as Python or R; command of statistical techniques and
        machine learning algorithms; ability to work with minimum
        supervision; effective communication, verbal and written,
        relationship management, and customer service skills with a focus
        on working effectively in a team environment. Travel: as required
        (10%).


        Preferred Qualifications


        Education: graduate degree in statistics, mathematics, analytics,
        or operations research. Experience: working with large consumer
        data to discern consumer behaviors and risk profiles, ideally in
        the telecommunication or banking industries.


        SPECTRUM CONNECTS YOU TO MORE


        Dynamic Growth: the growth of our industry and evolving technology
        powers our employees’ careers as they move up or around the
        company. Learning Culture: we invest in your learning and provide
        paid training and coaching to help you succeed. Supportive Teams:
        be part of a strong community that gives you opportunities to
        network and grow, and wants to see you succeed. Total Rewards: see
        all the ways we invest in you, at work and in life.


        Apply now, connect a friend to this opportunity or sign up for job
        alerts!


        BDA303 2023-25170 2023


        Here, employees don’t just have jobs, they build careers. That’s why we
        believe in offering a comprehensive pay and benefits package that
        rewards employees for their contributions to our success, supports all
        aspects of their well-being, and delivers real value at every stage of
        life.


        A qualified applicant’s criminal history, if any, will be considered in
        a manner consistent with applicable laws, including local ordinances.


        Get to Know Us Charter Communications is known in the United States by
        our Spectrum brands, including: Spectrum Internet®, TV, Mobile and
        Voice, Spectrum Networks, Spectrum Enterprise and Spectrum Reach. When
        you join us, you’re joining a strong community of more than 101,000
        individuals working together to serve more than 32 million customers in
        41 states and keep them connected to what matters most. Watch this video
        to learn more.


        Who You Are Matters Here We’re committed to growing a workforce that
        reflects our communities, and providing equal opportunities for
        employment and advancement.
      - >-
        requirements, determine technical issues, and design reports to
        meet data analysis needs. Developing and maintaining web-based
        dashboards for real-time reporting of key performance indicators
        for Operations; dashboards must be simple to use, easy to
        understand, and accurate. Maintenance of current managerial reports
        and development of new reports. Develop and maintain a reporting
        playbook and change log. Other duties in the PUA department as
        assigned.


        What YOU Will Bring To C&F


        Solid analytical and problem-solving skills. Intuitive,
        data-oriented, with a creative, solutions-based approach. Ability
        to manage time, multi-task, and prioritize multiple assignments
        effectively. Ability to work independently and as part of a team.
        Able to recognize and analyze business and data issues with minimal
        supervision, with the ability to escalate when necessary. Able to
        identify cause-and-effect relationships in data and work process
        flows.


        Requirements


        3 years in an Analyst role is required. A bachelor’s degree in an
        associated field of study (data science, computer science,
        mathematics, economics, statistics, etc.) is required. Experience
        using SQL is required. Experience with common data science toolkits
        is required. Prior experience creating operations analysis.


        What C&F Will Bring To You


        Competitive compensation package. Generous 401(k) employer match.
        Employee Stock Purchase plan with employer matching. Generous paid
        time off. Excellent benefits that go beyond health, dental &
        vision: our programs are focused on your whole family’s wellness,
        including your physical, mental, and financial wellbeing. A core
        C&F tenet is owning your career development, so we provide a wealth
        of ways for you to keep learning, including tuition reimbursement,
        industry-related certifications, and professional training to keep
        you progressing on your chosen path. A dynamic, ambitious, fun, and
        exciting work environment. We believe you do well by doing good and
        want to encourage a spirit of social and community responsibility:
        a matching donation program, volunteer opportunities, and an
        employee-driven corporate giving program that lets you participate
        and support your community.


        At C&F you will BELONG


        We value inclusivity and diversity. We are committed to 


        Crum & Forster is committed to ensuring a workplace free from
        discriminatory pay disparities and complying with applicable pay equity
        laws. Salary ranges are available for all positions at this location,
        taking into account roles with a comparable level of responsibility and
        impact in the relevant labor market and these salary ranges are
        regularly reviewed and adjusted in accordance with prevailing market
        conditions. The annualized base pay for the advertised position, located
        in the specified area, ranges from a minimum of $68,000 to a maximum of
        $113,300. The actual compensation is determined by various factors,
        including but not limited to the market pay for the jobs at each level,
        the responsibilities and skills required for each job, and the
        employee’s contribution (performance) in that role. To be considered
        within market range, a salary is at or above the minimum of the range.
        You may also have the opportunity to participate in discretionary equity
        (stock) based compensation and/or performance-based variable pay
        programs.
  - source_sentence: Data analysis, dashboard development, root cause analysis
    sentences:
      - >-
        skills to help establish routine reporting, conduct root cause analysis,
        and continuously improve data quality and processes.

        Experience in data analysis, problem-solving, or data science.
        Proficiency in Excel required, with experience in Tableau, SQL, or
        SAS preferred. Open to using various technologies. A mix of
        technical skills and the ability to learn supply chain domain
        knowledge. Strong communication and storytelling skills.
        Entrepreneurial mindset with flexibility to work in a dynamic
        environment.

        Soft Skills Needed: Problem solving: ability to creatively solve
        problems through data analysis. Curiosity: a curious nature and
        willingness to learn; Carter prioritizes this over experience.
        Entrepreneurial mindset: comfort with ambiguity and willingness to
        work scrappy in a dynamic environment. Critical thinking: ability
        to think critically about data and uncover insights. Communication:
        comfort communicating findings to cross-functional teams.
        Adaptability: openness to different perspectives and willingness to
        be influenced by new ideas. Go-getter attitude: a self-starter
        mentality and comfort wearing multiple hats.

        Qualities of Successful Candidates: Carter is seeking a
        problem-solver first and foremost, not a supply chain expert; he
        prioritizes soft skills over industry experience. We are looking
        for a self-starter who is eager to take ownership of this role.
        This is an opportunity for hands-on experience working directly
        with a senior leader to help transform data and processes. The
        ideal candidate will be a creative problem-solver who thrives in an
        ambiguous environment. The data environment is dynamic and
        ambiguous, with limited resources currently; candidates should be
        comfortable with uncertainty.
      - >-
        experienced data analysts/scientists.


        Qualifications


        Master's degree and at least 3 years of relevant experience. Strong
        organization and timeline management skills. Experience in AI/ML
        modeling approaches such as metabolic modeling, convolutional
        neural networks, and Gradient-weighted Class Activation Mapping.
        Understanding of all phases of the analytic process, including data
        collection, preparation, modeling, evaluation, and deployment.


        Anticipated hiring range: $100,000 - $120,000 / annual


        To Apply


        Please visit UVA job board: https://jobs.virginia.edu and search for
        “R0056431”


        Complete An Application And Attach


        Cover Letter and Curriculum Vitae


        Please note that multiple documents can be uploaded in the box.


        INTERNAL APPLICANTS: Please search for "find jobs" on your workday home
        page and apply using the internal job board.


        Review of applications will begin January 22, 2024 and continue until
        the position is filled.


        For questions about the position, please contact: Adam Greene, Research
        Program Officer ([email protected]) For questions about the
        application process, please contact: Rhiannon O'Coin ([email protected])


        For more information about the School of Data Science, please see
        www.datascience.virginia.edu


        For more information about the University of Virginia and the
        Charlottesville community, please see
        www.virginia.edu/life/charlottesville and www.embarkuva.com


        The selected candidate will be required to complete a background check
        at the time of the offer per University policy.


        PHYSICAL DEMANDS: This is primarily a sedentary job involving
        extensive use of desktop computers. The job does occasionally
        require traveling some distance to attend meetings and programs.


        The University of Virginia, including the UVA Health System which
        represents the UVA Medical Center, Schools of Medicine and Nursing, UVA
        Physician’s Group and the Claude Moore Health Sciences Library, are
        fundamentally committed to the diversity of our faculty and staff. We
        believe diversity is excellence expressing itself through every person's
        perspectives and lived experiences. We are equal opportunity and
        affirmative action employers. All qualified applicants will receive
        consideration for employment without regard to age, color, disability,
        gender identity or expression, marital status, national or ethnic
        origin, political affiliation, race, religion, sex (including
        pregnancy), sexual orientation, veteran status, and family medical or
        genetic information.
      - >-
        SKILLS AND EXPERIENCE: 4+ years of experience in machine learning
        and software engineering. Multiple years of experience deploying
        machine learning and statistical models into real-world
        applications. Experience writing production-level code. Good
        communication skills and experience working cross-functionally with
        non-technical teams. Experience with techniques such as
        classification, regression, tree-based methods, or anomaly
        detection. Huge plus: experience in the pricing or automotive
        industry! Tools: Python, Spark, PySpark.

        THE BENEFITS: As a Senior Machine Learning Engineer, you can expect
        a base salary between $150,000 and $180,000 (based on experience)
        plus competitive benefits.

        HOW TO APPLY: Please register your interest by sending your CV to
        Kristianna Chung via the Apply link on this page.
  - source_sentence: >-
      Data Visualization with Power BI, Advanced Analytics Model Deployment,
      Azure Analytics Services
    sentences:
      - >-
        experience, skills and abilities will determine where an employee is
        ultimately placed in the pay range.


        Category/Shift


        Salaried Full-Time


        Physical Location:


        6420 Poplar Avenue


        Memphis, TN


        Flexible Remote Work Schedule


        The Job You Will Perform


        Lead the hands-on IT development and deployment of data science and
        advanced analytics solutions for the North American Container (NAC)
        division of International Paper to support business strategies
        across approximately 200 packaging and specialty plants in the US
        and Mexico. Break down complex data science methodologies for
        business leaders in a way that is applicable to our North American
        Container business strategy. Identify opportunities for improving
        business performance and present identified opportunities to senior
        leadership, proactively driving the discovery of business value
        through data. Collaborate directly with NAC business partners to
        produce user stories, analyze source data capabilities, identify
        issues and opportunities, develop data models, and test and deploy
        innovative analytics solutions and systems. Lead the application of
        data science techniques to analyze and interpret complex data sets,
        providing insights and enabling data-driven decision-making for
        North American Container. Lead analytics projects through agile or
        traditional project management methodologies. Influence IT
        projects/initiatives with project managers, business leaders, and
        other IT groups without direct reporting relationships. Work
        closely with IT Application Services team members to follow
        standards, best practices, and consultation for data engineering.
        The role includes data analysis, predictive and prescriptive
        modeling, machine learning, and algorithm development, as well as
        collaborating and cross-training with analytics and visualization
        teams. Under general direction, works on complex technical
        issues/problems of large scope, impact, or importance;
        independently resolves complex problems that have significant cost;
        leads new technology innovations that define new “frontiers” in
        technical direction.


        The Skills You Will Bring 


        Bachelor’s degree in Computer Science, Information Technology,
        Statistics, or a related field is required; a master’s degree
        and/or PhD is preferred. Minimum 12 years of relevant work
        experience, less if holding a master’s or PhD. Skills with data
        visualization using tools like Microsoft Power BI. Demonstrated
        leadership in building and deploying advanced analytics models for
        solving real business problems. Strong interpersonal and
        communication skills. Adaptable to a changing work environment and
        dealing with ambiguity as it arises.

        Data Science Skills: data analysis; predictive and prescriptive
        modeling; machine learning (Python / R); artificial intelligence
        and large language models; algorithm development; experience with
        Azure Analytics Services.

        Competencies: dealing with ambiguity; functional / technical
        skills; problem solving; creativity.

        The Benefits You Will Enjoy


        Paid time off, including vacation and holidays; retirement and
        401(k) matching program; medical & dental; education & development
        (including tuition reimbursement); life & disability insurance.


        The Career You Will Build


        Leadership trainingPromotional opportunities


        The Impact You Will Make


        We continue to build a better future for people, the planet, and
        our company! IP has been a good steward of sustainable practices
        across communities around the world for more than 120 years. Join
        our team and you’ll see why our team members say they’re Proud to
        be IP.


        The Culture You Will Experience


        International Paper promotes employee well-being by providing safe,
        caring and inclusive workplaces. You will learn Safety Leadership
        Principles and have the opportunity to opt into Employee Networking
        Circles such as IPVets, IPride, Women in IP, and the African American
        ENC. We invite you to bring your uniqueness, creativity, talents,
        experiences, and safety mindset to be a part of our increasingly diverse
        culture.


        The Company You Will Join


        International Paper (NYSE: IP) is a leading global supplier of renewable
        fiber-based products. We produce corrugated packaging products that
        protect and promote goods, and enable worldwide commerce, and pulp for
        diapers, tissue and other personal care products that promote health and
        wellness. Headquartered in Memphis, Tenn., we employ approximately
        38,000 colleagues globally. We serve customers worldwide, with
        manufacturing operations in North America, Latin America, North Africa
        and Europe. Net sales for 2021 were $19.4 billion. Additional
        information can be found by visiting InternationalPaper.com.


        International Paper is an Equal Opportunity/Affirmative Action Employer.
        All qualified applicants will receive consideration for employment
        without regard to sex, gender identity, sexual orientation, race, color,
        religion, national origin, disability, protected veteran status, age, or
        any other characteristic protected by law.
      - >-
        skills and business acumen to drive impactful results that inform
        strategic decisions.Commitment to iterative development, with a proven
        ability to engage and update stakeholders bi-weekly or as necessary,
        ensuring alignment, feedback incorporation, and transparency throughout
        the project lifecycle.Project ownership and development from inception
        to completion, encompassing tasks such as gathering detailed
        requirements, data preparation, model creation, result generation, and
        data visualization. Develop insights, methods or tools using various
        analytic methods such as causal-model approaches, predictive modeling,
        regressions, machine learning, time series analysis, etc.Handle large
        amounts of data from multiple and disparate sources, employing advanced
        Python and SQL techniques to ensure efficiency and accuracyUphold the
        highest standards of data integrity and security, aligning with both
        internal and external regulatory requirements and compliance protocols


        Required Qualifications, Capabilities, And Skills


        PhD or MSc. in a scientific field (Computer Science, Engineering,
        Operations Research, etc.) plus 6 years or more of experience in
        producing advanced analytics work with an emphasis in optimizationStrong
        proficiency in statistical software packages and data tools, including
        Python and SQLStrong proficiency in Advanced Statistical methods and
        concepts, predictive modeling, time series forecasting, text
        miningStrong proficiency in Data Mining & Visualization (Tableau
        experienced preferred)Experience in Cloud and Big Data platforms such as
        AWS, Snowflake, Hadoop, Hive, Pig, Apache Spark, etc.Strong story
        telling capabilities including communicating complex concepts into
        digestible information to be consumed by audiences of varying levels in
        the organizationStrong commitment to iterative development, with a
        proven ability to engage and update stakeholders bi-weekly or as
        necessary, ensuring alignment, feedback incorporation, and transparency
        throughout the project lifecycle.


        Preferred Qualifications, Capabilities, And Skills


        Financial Service industry experience preferredExperience /
        Understanding of Cloud Storage (Object Stores like S3, Blob; NoSQL like
        Columnar, Graph databases) 


        ABOUT US


        Chase is a leading financial services firm, helping nearly half of
        America’s households and small businesses achieve their financial goals
        through a broad range of financial products. Our mission is to create
        engaged, lifelong relationships and put our customers at the heart of
        everything we do. We also help small businesses, nonprofits and cities
        grow, delivering solutions to solve all their financial needs.


        We offer a competitive total rewards package including base salary
        determined based on the role, experience, skill set, and location. For
        those in eligible roles, discretionary incentive compensation which may
        be awarded in recognition of individual achievements and contributions.
        We also offer a range of benefits and programs to meet employee needs,
        based on eligibility. These benefits include comprehensive health care
        coverage, on-site health and wellness centers, a retirement savings
        plan, backup childcare, tuition reimbursement, mental health support,
        financial coaching and more. Additional details about total compensation
        and benefits will be provided during the hiring process.


        We recognize that our people are our strength and the diverse talents
        they bring to our global workforce are directly linked to our success.
        We are 


        Equal Opportunity Employer/Disability/Veterans


        About The Team


        Our Consumer & Community Banking division serves our Chase customers
        through a range of financial services, including personal banking,
        credit cards, mortgages, auto financing, investment advice, small
        business loans and payment processing. We’re proud to lead the U.S. in
        credit card sales and deposit growth and have the most-used digital
        solutions  all while ranking first in customer satisfaction.
      - >-
        requirementsCollaborate with data engineers and data analysts to
        understand data needs and translate them into technical
        solutionsOptimize Snowflake warehouse configurations and DBT models for
        performance and cost efficiencyTroubleshoot and resolve data pipeline
        issues, ensuring smooth and efficient data flowParticipate in code
        reviews and provide feedback to team members to ensure code quality and
        adherence to best practicesStay updated with the latest developments in
        Snowflake and DBT technologies, and propose and implement innovative
        solutionsDocument data pipelines, transformations, and processes to
        facilitate knowledge sharing and maintain data lineageWork closely with
        cross-functional teams to support data-driven decision-making and
        business objectivesContribute to agile project planning and execution
        related to data engineering tasks and initiatives

        Skills8+ years of experience working on relational databases, SQL, and
        stored proceduresAdvanced working SQL knowledge and experience working
        with relational databases, query authoring (SQL) as well as working
        familiarity with a variety of databases such as DBT and Snowflake for
        Data WarehouseAt least 3+ years of experience working on Snowflake,
        building data warehousing solutions, dealing with slowly changing
        dimensions as wellHighly preferred to have prior experience in creating
        DW models on SAP ECC, Salesforce systemsAt least 3+ years of experience
        in developing and deploying data transformations using DBT, including
        creating/debugging macros5+ experience in supporting end-to-end data
        model build and maintenance, including testing/UATBuild, maintain and
        test data pipelines using cloud ETL/ELT tools, preferably SnapLogicPrior
        experience in working on SAP HANA
  - source_sentence: >-
      Marketing effectiveness measurement, content performance analysis, A/B
      testing for social media
    sentences:
      - >-
        requirements, prioritize tasks, and deliverintegrated
        solutions.Documentation and Best Practices: Document design decisions,
        implementation details, and bestpractices for data engineering
        processes, ensuring knowledge sharing and continuous improvementwithin
        the team.Qualifications:Bachelor's or Master's degree in Computer
        Science, Engineering, or related field.Proven experience as a Data
        Engineer, preferably with specialization in handling image data.Strong
        proficiency in cloud computing platforms (e.g., AWS, Azure, Google
        Cloud) and related services(e.g., S3, EC2, Lambda,
        Kubernetes).Experience with data engineering tools like DataBrick,
        Snowflake, Glue etc.Proficiency in programming languages commonly used
        in data engineering (e.g., Python, Scala, Java) andfamiliarity with
        relevant libraries and frameworks (e.g., Apache Spark, TensorFlow,
        OpenCV).Solid understanding of data modeling, schema design, and
        database technologies (e.g., SQL, NoSQL,data warehouses).Familiarity
        with DevOps practices, CI/CD pipelines, and containerization
        technologies (e.g., Docker,Kubernetes).Strong problem-solving skills,
        analytical thinking, and attention to detail.Excellent communication and
        collaboration skills, with the ability to work effectively in a
        cross-functionalteam environment.
      - >-
        Hi All,

        This is Nithya from TOPSYSIT, We have a job requirement for Data
        Scientist with GenAI. If anyone interested please send me your updated
        resume along with contact details to [email protected]

        Any Visa is Fine on W2 except H1B ,OPT and CPT.If GC holders who can
        share PPN along with proper documentation are eligible

        Job Title Data Scientist with GenAILocation: Plano, TX-OnsiteEXP: 10
        Years Description:Competencies: SQL, Natural Language Processing (NLP),
        Python, PySpark/ApacheSpark, Databricks.Python libraries: Numpy, Pandas,
        SK-Learn, Matplotlib, Tensorflow, PyTorch.Deep Learning: ANN, RNN, LSTM,
        CNN, Computer vision.NLP: NLTK, Word Embedding, BOW, TF-IDF, World2Vec,
        BERT.Framework: Flask or similar.

        Thanks & Regards,Nithya Kandee:[email protected]:678-899-6898
      - >-
        Skills:5+ years of marketing or business analytics experience with
        synthesizing large-scale data sets to generate insights and
        recommendations.5+ years of working experience using SQL, Excel,
        Tableau, and/or Power B. R & Python knowledge are
        preferred.Understanding of the data science models used for measuring
        marketing incrementality, e.g. multi-touch attribution, marketing mix
        models, causal inference, time-series regression, match market test,
        etc....Understanding of the full-funnel cross-platform marketing and
        media landscape and experience evolving analytics and measurement
        capabilities.Flexibility in priority shifts and fast iterations/agile
        working environment.Strong problem-solving skills, and ability to
        structure problems into an analytics plan.

        Pride Global offers eligible employee’s comprehensive healthcare
        coverage (medical, dental, and vision plans), supplemental coverage
        (accident insurance, critical illness insurance and hospital indemnity),
        401(k)-retirement savings, life & disability insurance, an employee
        assistance program, legal support, auto, home insurance, pet insurance
        and employee discounts with preferred vendors.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-distilroberta-v1
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: ai job validation
          type: ai-job-validation
        metrics:
          - type: cosine_accuracy
            value: 0.9875
            name: Cosine Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: ai job test
          type: ai-job-test
        metrics:
          - type: cosine_accuracy
            value: 0.975609756097561
            name: Cosine Accuracy

SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a sentence-transformers model finetuned from sentence-transformers/all-distilroberta-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
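The Pooling and Normalize stages above can be sketched in plain NumPy. This is a toy illustration with fake token embeddings standing in for the actual RobertaModel output, not the model's real computation path:

```python
import numpy as np

# Toy stand-in for the token embeddings the Transformer stage produces.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(6, 768))   # 6 tokens, 768 dimensions
attention_mask = np.array([1, 1, 1, 1, 0, 0])  # last two tokens are padding

# (1) Pooling: mean over non-padding tokens (pooling_mode_mean_tokens)
mask = attention_mask[:, None]
sentence_embedding = (token_embeddings * mask).sum(axis=0) / mask.sum()

# (2) Normalize: scale to unit L2 norm, so a dot product between two
#     sentence embeddings equals their cosine similarity
sentence_embedding /= np.linalg.norm(sentence_embedding)

print(sentence_embedding.shape)                      # (768,)
print(round(np.linalg.norm(sentence_embedding), 6))  # 1.0
```

Because of the final normalization step, downstream similarity search can use a plain dot product.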

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Vishnu7796/my-finetuned-model")
# Run inference
sentences = [
    'Marketing effectiveness measurement, content performance analysis, A/B testing for social media',
    'Skills:5+ years of marketing or business analytics experience with synthesizing large-scale data sets to generate insights and recommendations.5+ years of working experience using SQL, Excel, Tableau, and/or Power B. R & Python knowledge are preferred.Understanding of the data science models used for measuring marketing incrementality, e.g. multi-touch attribution, marketing mix models, causal inference, time-series regression, match market test, etc....Understanding of the full-funnel cross-platform marketing and media landscape and experience evolving analytics and measurement capabilities.Flexibility in priority shifts and fast iterations/agile working environment.Strong problem-solving skills, and ability to structure problems into an analytics plan.\nPride Global offers eligible employee’s comprehensive healthcare coverage (medical, dental, and vision plans), supplemental coverage (accident insurance, critical illness insurance and hospital indemnity), 401(k)-retirement savings, life & disability insurance, an employee assistance program, legal support, auto, home insurance, pet insurance and employee discounts with preferred vendors.',
    'Hi All,\nThis is Nithya from TOPSYSIT, We have a job requirement for Data Scientist with GenAI. If anyone interested please send me your updated resume along with contact details to [email protected]\nAny Visa is Fine on W2 except H1B ,OPT and CPT.If GC holders who can share PPN along with proper documentation are eligible\nJob Title Data Scientist with GenAILocation: Plano, TX-OnsiteEXP: 10 Years Description:Competencies: SQL, Natural Language Processing (NLP), Python, PySpark/ApacheSpark, Databricks.Python libraries: Numpy, Pandas, SK-Learn, Matplotlib, Tensorflow, PyTorch.Deep Learning: ANN, RNN, LSTM, CNN, Computer vision.NLP: NLTK, Word Embedding, BOW, TF-IDF, World2Vec, BERT.Framework: Flask or similar.\nThanks & Regards,Nithya Kandee:[email protected]:678-899-6898',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Triplet

| Metric              | ai-job-validation | ai-job-test |
|:--------------------|:------------------|:------------|
| cosine_accuracy     | 0.9875            | 0.9756      |
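Here `cosine_accuracy` is triplet accuracy: the fraction of (query, positive, negative) triplets for which the query embedding is closer, by cosine similarity, to the positive job description than to the negative one. A minimal NumPy sketch of that computation, using synthetic vectors rather than the actual evaluation data:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_cosine_accuracy(anchors, positives, negatives):
    # A triplet counts as correct when the anchor is more similar
    # to its positive than to its negative.
    hits = sum(
        cosine(a, p) > cosine(a, n)
        for a, p, n in zip(anchors, positives, negatives)
    )
    return hits / len(anchors)

# Synthetic check: positives are tiny perturbations of the anchors,
# negatives point the opposite way, so every triplet is correct.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
positives = anchors + 0.01 * rng.normal(size=(4, 8))
negatives = -anchors
print(triplet_cosine_accuracy(anchors, positives, negatives))  # 1.0
```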

Training Details

Training Dataset

Unnamed Dataset

  • Size: 647 training samples
  • Columns: query, job_description_pos, and job_description_neg
  • Approximate statistics based on the first 647 samples:
    |         | query | job_description_pos | job_description_neg |
    |:--------|:------|:--------------------|:--------------------|
    | type    | string | string | string |
    | details | min: 8 tokens<br>mean: 15.05 tokens<br>max: 40 tokens | min: 7 tokens<br>mean: 350.34 tokens<br>max: 512 tokens | min: 7 tokens<br>mean: 352.82 tokens<br>max: 512 tokens |
  • Samples:
    query job_description_pos job_description_neg
    healthcare data analytics, pregnancy identification algorithms, causal modeling techniques experience in using, manipulating, and extracting insights from healthcare data with a particular focus on using machine learning with claims data. The applicant will be driven by curiosity, collaborating with a cross-functional team of Product Managers, Software Engineers, and Data Analysts.

    Responsibilities

    Apply data science, machine learning, and healthcare domain expertise to advance and oversee Lucina’s pregnancy identification and risk-scoring algorithms.Analyze healthcare data to study patterns of care and patient conditions which correlate to specific outcomes.Collaborate on clinical committee research and development work.Complete ad hoc analyses and reports from internal or external customers prioritized by management throughout the year.

    Qualifications

    Degree or practical experience in Applied Math, Statistics, Engineering, Information Management with 3 or more years of data analytics experience, Masters degree a plus.Experience manipulating and analyzing healthcare dat...
    Experience of Delta Lake, DWH, Data Integration, Cloud, Design and Data Modelling. Proficient in developing programs in Python and SQLExperience with Data warehouse Dimensional data modeling. Working with event based/streaming technologies to ingest and process data. Working with structured, semi structured and unstructured data. Optimize Databricks jobs for performance and scalability to handle big data workloads. Monitor and troubleshoot Databricks jobs, identify and resolve issues or bottlenecks. Implement best practices for data management, security, and governance within the Databricks environment. Experience designing and developing Enterprise Data Warehouse solutions. Proficient writing SQL queries and programming including stored procedures and reverse engineering existing process. Perform code reviews to ensure fit to requirements, optimal execution patterns and adherence to established standards.

    Requirements:

    You are:

    Minimum 9+ years of experience is required. 5+ years...
    Data Engineer Python Azure API integration experience preferred but not required.
    Must-Have Skills:10+ years of total IT experience required.of 4 years of proven and relevant experience in a similar Data Engineer role and/or Python Dev role.Strong proficiency in Python programming is essential for data manipulation, pipeline development, and integration tasks.In-depth knowledge of SQL for database querying, data manipulation, and performance optimization.Experience working with RESTful APIs and integrating data from external sources using API calls.Azure: Proficiency in working with Microsoft Azure cloud platform, including services like Azure Data Factory, Azure Databricks, and Azure Storage.
    requirements;Research & implement new data products or capabilitiesAutomate data visualization and reporting capabilities that empower users (both internal and external) to access data on their own thereby improving quality, accuracy and speedSynthesize raw data into actionable insights to drive business results, identify key trends and opportunities for business teams and report the findings in a simple, compelling wayEvaluate and approve additional data partners or data assets to be utilized for identity resolution, targeting or measurementEnhance PulsePoint's data reporting and insights generation capability by publishing internal reports about Health dataAct as the “Subject Matter Expert” to help internal teams understand the capabilities of our platforms, how to implement & troubleshoot
    RequirementsWhat are the ‘must haves’ we’re looking for?Minimum 3-5 years of relevant experience in:Creating SQL queries from scratch using real business data;Highly proficient knowledge of Excel (...
    Data Engineer big data technologies, cloud data warehousing, real-time data streaming experience in machine learning, distributed microservices, and full stack systems Utilize programming languages like Java, Scala, Python and Open Source RDBMS and NoSQL databases and Cloud based data warehousing services such as Redshift and Snowflake Share your passion for staying on top of tech trends, experimenting with and learning new technologies, participating in internal & external technology communities, and mentoring other members of the engineering community Collaborate with digital product managers, and deliver robust cloud-based solutions that drive powerful experiences to help millions of Americans achieve financial empowerment Perform unit tests and conduct reviews with other team members to make sure your code is rigorously designed, elegantly coded, and effectively tuned for performance

    Basic Qualifications:

    Bachelor’s Degree At least 2 years of experience in application development (Internship experience does not apply) At least 1 year of experience in big d...
    requirements of analyses and reports.Transform requirements into actionable, high-quality deliverables.Perform periodic and ad-hoc operations data analysis to measure performance and conduct root cause analysis for Claims, FRU, G&A, Provider and UM data.Compile, analyze and provide reporting that identifies and defines actionable information or recommends possible solutions for corrective actions.Partner with other Operations areas as needed to provide technical and other support in the development, delivery, maintenance, and enhancement of analytical reports and analyses.Collaborate with Operations Tower Leaders in identifying and recommending operational performance metrics; map metrics against targets and the company’s operational plans and tactical/strategic goals to ensure alignment and focus.Serve as a liaison with peers in other departments to ensure data integrity.Code and schedule reports using customer business requirements from Claims, FRU, G&A, Provider and UM data.

    Princi...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
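MultipleNegativesRankingLoss treats every other positive in the batch as an in-batch negative: the scaled cosine similarities between all (query, positive) pairs feed a cross-entropy loss whose target is the matching pair on the diagonal. A rough NumPy sketch under those assumptions (synthetic embeddings; the actual loss also uses the explicit `job_description_neg` hard negatives):

```python
import numpy as np

def mnrl_loss(queries, positives, scale=20.0):
    # Normalize rows so the dot product is cosine similarity (cos_sim)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (q @ p.T)  # (batch, batch): query i vs every positive
    # Cross-entropy with the matching pair (the diagonal) as the label
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 8))
loss_matched = mnrl_loss(queries, queries.copy())         # perfect pairs
loss_shuffled = mnrl_loss(queries, rng.normal(size=(4, 8)))  # random pairs
# Matched pairs should incur a far smaller loss than random pairings
print(loss_matched < loss_shuffled)
```

The `scale` of 20.0 sharpens the softmax over similarities, matching the parameters listed above.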
    

Evaluation Dataset

Unnamed Dataset

  • Size: 80 evaluation samples
  • Columns: query, job_description_pos, and job_description_neg
  • Approximate statistics based on the first 80 samples:
    |         | query | job_description_pos | job_description_neg |
    |:--------|:------|:--------------------|:--------------------|
    | type    | string | string | string |
    | details | min: 8 tokens<br>mean: 14.9 tokens<br>max: 25 tokens | min: 14 tokens<br>mean: 354.31 tokens<br>max: 512 tokens | min: 31 tokens<br>mean: 334.05 tokens<br>max: 512 tokens |
  • Samples:
    query job_description_pos job_description_neg
    Data analysis, operations reporting, SQL expertise requirements, determine technical issues, and design reports to meet data analysis needsDeveloping and maintaining web-based dashboards for real-time reporting of key performance indicators for Operations. Dashboards must be simple to use, easy to understand, and accurate.Maintenance of current managerial reports and development of new reportsDevelop and maintain reporting playbook and change logOther duties in the PUA department as assigned

    What YOU Will Bring To C&F

    Solid analytical and problem solving skillsIntuitive, data-oriented with a creative, solutions-based approachAbility to manage time, multi-task and prioritizes multiple assignments effectivelyAbility to work independently and as part of a teamAble to recognize and analyze business and data issues with minimal supervision, ability to escalate when necessaryAble to identify cause and effect relationships in data and work process flows

    Requirements

    3 years in an Analyst role is requiredA Bachelor’s degree in associated f...
    experience in data engineering, software engineering, data analytics, or machine learning.Strong expertise working with one or more cloud data platforms (Snowflake, Sagemaker, Databricks, etc.)Experience managing Snowflake infrastructure with terraform.Experience building batch, near real-time, and real-time data integrations with multiple sources including event streams, APIs, relational databases, noSQL databases, graph databases, document stores, and cloud object stores.Strong ability to debug, write, and optimize SQL queries in dbt. Experience with dbt is a must.Strong programming experience in one or more modern programming languages (Python, Clojure, Scala, Java, etc.)Experience working with both structured and semi-structured data.Experience with the full software development lifecycle including requirements gathering, design, implementation, testing, deployment, and iteration.Strong understanding of CI/CD principles.Strong ability to document, diagram, and deliver detailed pres...
    AWS Sagemaker, ML Model Deployment, Feedback Loop Automation Qualifications

    AWS tools and solutions including Sagemaker, Redshift, AthenaExperience with Machine learning libraries such as PyTorchHands-on experience with designing, developing and deploying workflows with ML models with feedback loops; Uses Bitbucket workflows and has experience with CI/CDDeep experience in at least two of the following languages: PySpark/Spark, Python, CWorking knowledge of AI/ML algorithms. Large language models (LLMs), Retrieval-augmented generation (RAN), Clustering algorithms (such as K-Means), Binary classifiers (such as XGBoost)High level of self-starter, learning, and initiative behaviors Preferred:Background as a software engineer and experience as a data scientistFeatures Stores

    Why Teaching Strategies

    At Teaching Strategies, our solutions and services are only as strong as the teams that create them. By bringing passion, dedication, and creativity to your job every day, there's no telling what you can do and where you can go! We provide a competitive...
    requirements and metrics.
    Provide training and support to end-users on data quality best practices and tools.
    Develop and maintain documentation related to data quality processes.

    Education Qualification:

    Bachelor's degree in a related field such as Data Science, Computer Science, or Information Systems.

    Required Skills:

    Experience working as a BA/Data Analyst in a Data warehouse/Data governance platform.
    Strong analytical and problem-solving skills.
    Proficiency in SQL, data analysis, and data visualization tools.
    Critical thinking.
    Ability to understand and examine complex datasets.
    Ability to interpret Data quality results and metrics.

    Desired Skills:

    Knowledge of Data quality standards and processes.
    Proven experience in a Data Quality Analyst or similar role.
    Experience with data quality tools such as Informatica, PowerCurve, or Collibra DQ is preferred.
    Certifications in data management or quality assurance (e.g.
    Certified Data Management Professional, Certified Quality ...
    Financial analysis, process re-engineering, client relationship management skills:
    BA/BS degree in finance-related field and/or 2+ years working in finance or related field Strong working knowledge of Microsoft Office (especially Excel) Ability to work in a fast-paced environment and attention to detail. This role includes reviews and reconciliation of financial information.
    General Position Summary
    The Business Analyst performs professional duties related to the review, assessment and development of business systems and processes as well as new client requirements. This includes reviewing existing processes to develop strong QA procedures as well as maximizing review efficiencies and internal controls through process re-engineering. The Business Analyst will assist with the development of seamless solutions for unique requirements of new clients, delivered and implemented on time and within scope. This role will ensure that all activity, reconciliation, reporting, and analysis is carried out in an effective, timely and accurate manner and will look for cont...
    Skills / Experience:Required: Proficiency with Python, pyTorch, Linux, Docker, Kubernetes, Jupyter. Expertise in Deep Learning, Transformers, Natural Language Processing, Large Language Models
    Preferred: Experience with genomics data, molecular genetics. Distributed computing tools like Ray, Dask, Spark.
    Thanks & RegardsBharat Priyadarshan GuntiHead of Recruitment & OperationsStellite Works LLC4841 W Stonegate Circle Lake Orion MI - 48359Contact: 313 221 [email protected]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

| Epoch | Step | ai-job-validation_cosine_accuracy | ai-job-test_cosine_accuracy |
|:------|:-----|:----------------------------------|:----------------------------|
| 0     | 0    | 0.85                              | -                           |
| 1.0   | 41   | 0.9875                            | 0.9756                      |

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 2.14.4
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}