Artuso et al., 2024 - Google Patents

Binbert: Binary code understanding with a fine-tunable and execution-aware transformer

Artuso et al., 2024

View PDF
Document ID
9936072531829048931
Author
Artuso F
Mormando M
Di Luna G
Querzoni L
Publication year
Publication venue
IEEE Transactions on Dependable and Secure Computing

External Links

Snippet

A recent trend in binary code analysis promotes the use of neural solutions based on instruction embedding models. An instruction embedding model is a neural network that transforms assembly instructions into embedding vectors. If the embedding network is able …
Continue reading at arxiv.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562Static detection
    • G06F21/563Static detection by source code analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code
    • G06F8/41Compilation
    • G06F8/44Encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2765Recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/02Knowledge representation
    • G06N5/022Knowledge engineering, knowledge acquisition
    • G06N5/025Extracting rules from data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G06F21/77Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in smart cards
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10Protecting distributed programs or content, e.g. vending or licensing of copyrighted material
    • G06F21/12Protecting executable software
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/04Inference methods or devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6279Classification techniques relating to the number of classes

Similar Documents

Publication Publication Date Title
VenkataKeerthy et al. Ir2vec: Llvm ir based scalable program embeddings
Jeon et al. AutoVAS: An automated vulnerability analysis system with a deep learning approach
Shen et al. A survey of automatic software vulnerability detection, program repair, and defect prediction techniques
Rahimian et al. Bincomp: A stratified approach to compiler provenance attribution
Reynolds et al. Scaling up DPLL (T) string solvers using context-dependent simplification
CN113010209A (en) Binary code similarity comparison technology for resisting compiling difference
Alrabaee et al. On leveraging coding habits for effective binary authorship attribution
Ji et al. Vestige: Identifying binary code provenance for vulnerability detection
Patrick-Evans et al. XFL: naming functions in binaries with extreme multi-label learning
Xue et al. Hecate: Automated customization of program and communication features to reduce attack surfaces
Xiong et al. HexT5: Unified Pre-Training for Stripped Binary Code Information Inference
Artuso et al. Binbert: Binary code understanding with a fine-tunable and execution-aware transformer
Hojjat et al. On strings in software model checking
Mi et al. An automated vulnerability detection framework for smart contracts
Şahin Malware detection using transformers-based model GPT-2
Liu et al. Universal representation for code
Semasaba et al. An empirical evaluation of deep learning‐based source code vulnerability detection: Representation versus models
Pujar et al. Analyzing source code vulnerabilities in the D2A dataset with ML ensembles and C-BERT
Khodadadi et al. Hymo: Vulnerability detection in smart contracts using a novel multi-modal hybrid model
Zhang et al. InsFuzz: Fuzzing binaries with location sensitivity
Xia et al. Binary code similarity analysis based on naming function and common vector space
Hanifi et al. Software Vulnerability Prediction Knowledge Transferring Between Programming Languages
Wang et al. COBRA-GCN: Contrastive Learning to Optimize Binary Representation Analysis with Graph Convolutional Networks
Demidov et al. Application of siamese neural networks for fast vulnerability detection in mips executable code
Liu et al. DL4SC: a novel deep learning-based vulnerability detection framework for smart contracts