URLs

Information about URLs.

URLs not only carry information by themselves; they can also give
contextual information about files and other elements on VT.

Different URL calls may return different URL-related objects that we list here.

Object Attributes

  • categories: <dictionary> the key is the partner who categorised the URL and the value is the URL's category according to that partner.
  • favicon: <dictionary> difference hash and MD5 hash of the URL's favicon. Only returned in the premium API.
    • dhash: <string> favicon's difference hash.
    • raw_md5: <string> favicon's MD5 hash.
  • first_submission_date: <integer> UTC timestamp of the date when the URL was first submitted to Google Threat Intelligence.
  • gti_assessment: <dictionary> containing the following fields:
    • verdict: <dictionary>. The value property can have any of these values:
      • VERDICT_BENIGN: the entity is considered harmless.
      • VERDICT_UNDETECTED: no immediate evidence of malicious intent.
      • VERDICT_SUSPICIOUS: possible malicious activity detected, requires further investigation.
      • VERDICT_MALICIOUS: high confidence that the entity poses a threat.
      • VERDICT_UNKNOWN: we were not able to generate a verdict for this entity.
    • severity: <dictionary>. The value property can have any of these values:
      • SEVERITY_NONE: the level assigned to entities with a non-malicious verdict.
      • SEVERITY_LOW: the threat likely has a minor impact but should still be monitored.
      • SEVERITY_MEDIUM: indicates a potential threat that warrants attention.
      • SEVERITY_HIGH: immediate action is recommended; the threat could have a critical impact.
      • SEVERITY_UNKNOWN: not enough data to assess a severity.
    • description: <string> a human readable description of the factors contributing to the verdict and severity classification.
    • threat_score: <int> the Google Threat Intelligence score, a function of the verdict and severity that also leverages additional internal factors. Valid values range from 0 to 100.
    • contributing_factors: <dictionary> the signals that contributed to the verdict and severity classification.
      • mandiant_analyst_benign: <bool> the indicator was determined as benign by a Google Threat Intelligence analyst and likely poses no threat.
      • mandiant_analyst_malicious: <bool> it was determined as malicious by a Google Threat Intelligence analyst.
      • google_malware_analysis: <bool> it was detected by Google Threat Intelligence's malware analysis.
      • google_botnet_emulation: <bool> it was detected by Google Threat Intelligence's botnet analysis.
      • google_mobile_malware_analysis: <bool> it was detected by Google Threat Intelligence's mobile malware analysis.
      • google_malware_similarity: <bool> it was detected by Google Threat Intelligence's malware analysis.
      • google_malware_analysis_auto: <bool> it was detected by Google Threat Intelligence's malware analysis.
      • mandiant_association_report: <bool> it is associated with a Google Threat Intelligence report.
      • mandiant_association_actor: <bool> it is associated with a tracked Google Threat Intelligence threat actor.
      • mandiant_association_malware: <bool> it is associated with a tracked Google Threat Intelligence malware family.
      • mandiant_confidence_score: <int> the Google Threat Intelligence confidence score of the indicator.
      • mandiant_domain_hijack: <bool> the domain was recently determined as malicious by a Google Threat Intelligence analyst.
      • mandiant_osint: <bool> it is considered widespread.
      • safebrowsing_verdict: <bool> Google Safe Browsing verdict.
      • gavs_detections: <int> number of detections by Google’s spam and threat filtering engines.
      • gavs_categories: <list of strings> known threat categories.
      • normalised_categories: <list of strings> known threat categories.
      • legitimate_software: <bool> the indicator is benign. It is associated with a well-known and trusted software distributor and likely poses no threat.
      • matched_malicious_yara: <bool> matches YARA rules.
      • malicious_sandbox_verdict: <bool> it was detected by sandbox analysis, indicating suspicious behavior.
      • associated_reference: <bool> it appears in public sources.
      • associated_malware_configuration: <bool> contains known malware configurations.
      • associated_actor: <bool> it is associated with a community threat actor.
      • high_severity_related_files: <bool> related files are marked as malicious (high severity).
      • medium_severity_related_files: <bool> related files are marked as malicious (medium severity).
      • low_severity_related_files: <bool> related files are marked as malicious (low severity).
  • html_meta: <dictionary> containing all meta tags (only for URLs that return an HTML document). Each key is a meta tag name and its value is a list containing all values of that meta tag.
  • last_analysis_date: <integer> UTC timestamp representing last time the URL was scanned.
  • last_analysis_results: <dictionary> results from URL scanners. Each key is a scanner name and its value is a dictionary with the notes and result from that scanner:
    • category: <string> normalized result. It can be:
      • "harmless" (site is not malicious),
      • "undetected" (scanner has no opinion about this site),
      • "suspicious" (scanner thinks the site is suspicious),
      • "malicious" (scanner thinks the site is malicious).
    • engine_name: <string> complete name of the URL scanning service.
    • method: <string> type of service given by that URL scanning service (e.g. "blacklist").
    • result: <string> raw value returned by the URL scanner ("clean", "malicious", "suspicious", "phishing"). It may vary from scanner to scanner, hence the need for the "category" field for normalisation.
  • last_analysis_stats: <dictionary> number of results of each kind from the last scan.
    • harmless: <integer> number of reports saying that the URL is harmless.
    • malicious: <integer> number of reports saying that the URL is malicious.
    • suspicious: <integer> number of reports saying that the URL is suspicious.
    • timeout: <integer> number of timeouts when checking this URL.
    • undetected: <integer> number of reports saying that the URL is undetected.
  • last_final_url: <string> if the original URL redirects, the URL where the redirection ends.
  • last_http_response_code: <integer> HTTP response code of the last response.
  • last_http_response_content_length: <integer> length in bytes of the content received.
  • last_http_response_content_sha256: <string> URL response body's SHA256 hash.
  • last_http_response_cookies: <dictionary> containing the website's cookies.
  • last_http_response_headers: <dictionary> containing headers and values of last HTTP response.
  • last_modification_date: <integer> UTC timestamp representing last modification date.
  • last_submission_date: <integer> UTC timestamp representing last time it was sent to be analysed.
  • outgoing_links: <list of strings> containing links to different domains.
  • redirection_chain: <list of strings> history of redirections followed when visiting a given URL. The last URL of the chain is not included in the list since it is available at the last_final_url attribute.
  • reputation: <integer> value of votes from VT community.
  • tags: <list of strings> tags.
  • targeted_brand: <dictionary> targeted brand info extracted from phishing engines.
  • times_submitted: <integer> number of times that URL has been checked.
  • title: <string> webpage title.
  • total_votes: <dictionary> containing the number of positive ("harmless") and negative ("malicious") votes received from VT community.
    • harmless: <integer> number of positive votes.
    • malicious: <integer> number of negative votes.
  • trackers: <dictionary> all trackers found in that URL, including historical ones. Each key is a tracker name, and its entry contains:
    • id: <string> tracker ID, if available.
    • timestamp: <integer> tracker ingestion date as UNIX timestamp.
    • url: <string> tracker script URL.
  • url: <string> original URL to be scanned.
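A minimal sketch of how these attributes might be read once a URL object's attributes are already available as a Python dictionary. The helper names (`full_redirect_path`, `flagged_scanners`, `summarize_url`) and the sample data are hypothetical and exist only for illustration. Falling back to the `*_UNKNOWN` values when `gti_assessment` is absent is an assumption of this sketch, not documented behaviour.

```python
from datetime import datetime, timezone


def full_redirect_path(attrs):
    """Return every hop, appending last_final_url, which the chain omits."""
    path = list(attrs.get("redirection_chain", []))
    final_url = attrs.get("last_final_url")
    if final_url:
        path.append(final_url)
    return path


def flagged_scanners(attrs):
    """Sorted scanner names whose normalized category is malicious or suspicious."""
    results = attrs.get("last_analysis_results", {})
    return sorted(
        name for name, entry in results.items()
        if entry.get("category") in ("malicious", "suspicious")
    )


def summarize_url(attrs):
    """Collect a few headline fields from a URL object's attributes."""
    assessment = attrs.get("gti_assessment", {})
    scanned = attrs.get("last_analysis_date")
    return {
        # Assumption: treat a missing assessment as "unknown".
        "verdict": assessment.get("verdict", {}).get("value", "VERDICT_UNKNOWN"),
        "severity": assessment.get("severity", {}).get("value", "SEVERITY_UNKNOWN"),
        "flagged_by": flagged_scanners(attrs),
        "redirect_path": full_redirect_path(attrs),
        # Timestamps are UTC epoch seconds, so convert with an explicit timezone.
        "last_scanned": (
            datetime.fromtimestamp(scanned, tz=timezone.utc).isoformat()
            if scanned is not None else None
        ),
    }


# Made-up attributes for illustration only; not real API output.
sample_attrs = {
    "url": "http://example.com/start",
    "redirection_chain": ["http://example.com/start", "http://example.com/hop"],
    "last_final_url": "http://example.com/landing",
    "last_analysis_date": 1700000000,
    "last_analysis_results": {
        "ScannerA": {"category": "malicious", "result": "phishing"},
        "ScannerB": {"category": "harmless", "result": "clean"},
        "ScannerC": {"category": "suspicious", "result": "suspicious"},
    },
    "gti_assessment": {
        "verdict": {"value": "VERDICT_MALICIOUS"},
        "severity": {"value": "SEVERITY_HIGH"},
    },
}

summary = summarize_url(sample_attrs)
print(summary)
```

The sketch filters on the normalized `category` field rather than the raw `result` string, since `result` may vary from scanner to scanner.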

Relationships

In addition to the previously described attributes, URL objects contain relationships with other objects in our dataset that can be retrieved as explained in the Relationships section. The available relationships are described below.

| Relationship | Description | Accessibility | Return object type |
| --- | --- | --- | --- |
| analyses | Analyses for the URL. | VT Enterprise users only. | List of Analyses. |
| comments | Community posted comments about the URL. | Everyone. | List of Comments. |
| communicating_files | Files that communicate with a given URL when they're executed. | VT Enterprise users only. | List of Files. |
| contacted_domains | Domains from which the URL loads some kind of resource. | VT Enterprise users only. | List of Domains. |
| contacted_ips | IPs from which the URL loads some kind of resource. | VT Enterprise users only. | List of IP addresses. |
| downloaded_files | Files downloaded from the URL. | VT Enterprise users only. | List of Files. |
| graphs | Graphs including the URL. | Everyone. | List of Graphs. |
| last_serving_ip_address | Last IP address that served the URL. | Everyone. | A single IP address. |
| network_location | Domain or IP for the URL. | Everyone. | A single IP address or Domain. |
| referrer_files | Files containing the URL. | VT Enterprise users only. | List of Files. |
| referrer_urls | URLs referring the URL. | VT Enterprise users only. | List of URLs. |
| redirecting_urls | URLs that redirected to the given URL. | VT Enterprise users only. | List of URLs. |
| redirects_to | URLs that the URL redirects to. | VT Enterprise users only. | List of URLs. |
| related_comments | Community posted comments in the URL's related objects. | Everyone. | List of Comments. |
| related_references | References related to the URL. | VT Enterprise users only. | List of References. |
| related_threat_actors | Threat actors related to the URL. | VT Enterprise users only. | List of Threat Actors. |
| submissions | URL's submissions. | VT Enterprise users only. | List of Submissions. |
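As a rough sketch, a client could check a relationship's accessibility before requesting it. The two sets below are copied from the accessibility information listed above. The path layout `/urls/{id}/{relationship}` and the unpadded URL-safe base64 identifier are assumptions based on the public VirusTotal API v3 conventions and are not stated in this section. The function names are hypothetical.

```python
import base64

# Relationships listed above as accessible to everyone.
PUBLIC_RELATIONSHIPS = frozenset({
    "comments", "graphs", "last_serving_ip_address",
    "network_location", "related_comments",
})

# Relationships listed above as "VT Enterprise users only".
ENTERPRISE_RELATIONSHIPS = frozenset({
    "analyses", "communicating_files", "contacted_domains", "contacted_ips",
    "downloaded_files", "referrer_files", "referrer_urls", "redirecting_urls",
    "redirects_to", "related_references", "related_threat_actors", "submissions",
})


def url_identifier(url):
    """Unpadded URL-safe base64 of the URL (assumed identifier form)."""
    encoded = base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")
    return encoded.rstrip("=")


def needs_enterprise(relationship):
    """True when the relationship is listed as VT Enterprise only."""
    if relationship in ENTERPRISE_RELATIONSHIPS:
        return True
    if relationship in PUBLIC_RELATIONSHIPS:
        return False
    raise ValueError(f"unknown URL relationship: {relationship!r}")


def relationship_path(url, relationship):
    """Build the assumed request path for one relationship of a URL."""
    needs_enterprise(relationship)  # raises ValueError on unknown names
    return f"/urls/{url_identifier(url)}/{relationship}"


path = relationship_path("http://example.com/", "comments")
print(path, "| enterprise only:", needs_enterprise("comments"))
```

Validating the relationship name locally avoids sending a request that is known in advance to be malformed. Whether a given account can actually reach an enterprise-only relationship is still decided by the server.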