This repository has been archived by the owner on Nov 6, 2023. It is now read-only.

Regexps matching host that is not in targets #12297

Closed
20 of 77 tasks
RReverser opened this issue Aug 31, 2017 · 4 comments

@RReverser
Contributor

RReverser commented Aug 31, 2017

These cases were found as part of #12294. Either the regexp is outdated or the host was never added to the <target /> list; either way, these rules might be functioning incorrectly:

  • ABC-Online.xml: ^http://(?:www\.)?abccomercial\.com/ matches abccomercial.com, www.abccomercial.com which are not in targets shop.abc.net.au, abccommercial.com, *.abccommercial.com
  • ARM.xml: ^http://(?:www\.)?arm\.com/(?=css/|images/) matches arm.com which are not in targets *.arm.com
  • AkademikerForsakring.se.xml: ^http://(?:www\.)?akademikerf[oö]rs[aä]kring\.se/ matches akademikerforsäkring.se, akademikerförsakring.se, www.akademikerforsäkring.se, www.akademikerförsakring.se which are not in targets akademikerforsakring.se, www.akademikerforsakring.se, akademikerförsäkring.se, www.akademikerförsäkring.se
  • AlJazeera.com.xml: ^http://(m\.)?aljazeera\.com/ matches m.aljazeera.com which are not in targets aljazeera.com, www.aljazeera.com, staff.balkans.aljazeera.com, interactive.aljazeera.com, leaderssummit.aljazeera.com, staff.liberties.aljazeera.com, mediaview.aljazeera.com, syhacked.aljazeera.com, iforgot.aljazeera.net
  • Allianz_fur_Cyber-Sicherheit.xml: ^http://(?:www\.)?(allianz-?fuer-?cybersicherheit\.(?:de|org)|cybersicherheits-allianz\.de)/ matches allianzfuercybersicherheit.org, allianzfuer-cybersicherheit.de, allianzfuer-cybersicherheit.org which are not in targets allianzfuercybersicherheit.de, www.allianzfuercybersicherheit.de, allianz-fuer-cybersicherheit.de, www.allianz-fuer-cybersicherheit.de, allianz-fuer-cybersicherheit.org, www.allianz-fuer-cybersicherheit.org, cybersicherheits-allianz.de, www.cybersicherheits-allianz.de
  • Axis_Bank.xml: ^http://(?:www\.)?axisbank\.co(\.in|m)/ matches axisbank.co.in which are not in targets www.axisbank.co.in, axisbank.com, *.axisbank.com
  • BT_Wi-fi.xml: ^http://(my\.|www\.)?btwifi\.co(\.uk|m)/ matches my.btwifi.co.uk which are not in targets btwifi.co.uk, www.btwifi.co.uk, btwifi.com, *.btwifi.com
  • BazzApp.xml: ^http://(?:www\.)?bazapp\.com/ matches bazapp.com, www.bazapp.com which are not in targets bazzapp.com, www.bazzapp.com
  • Branson_Zipline.xml: ^http://(?:css|img)\.bzimages\.com/ matches css.bzimages.com which are not in targets bransonzipline.com, *.bransonzipline.com, img.bzimages.com
  • Captura-Group.xml: ^http://cdn\.measuredvoice\.com/ matches cdn.measuredvoice.com which are not in targets measuredvoice.com, www.measuredvoice.com
  • City-Mail.xml: ^http://(www\.)?citymail\.com/ matches citymail.com, www.citymail.com which are not in targets cityemail.com, www.cityemail.com
  • Clicksor.com.xml: ^http://pub\.clicksor\.net/ matches pub.clicksor.net which are not in targets *.clicksor.com
  • Commission_Junction.xml: ^http://www\.(afcyhf|anrdoezrs|apmebf|awltovhc|commission-junction|dpbolvw|emjcd|ftjcfx|jdoqocy|kdukvh|kqzyfj|pkracv|tkqlhce|tqlkg)\.(com|net)/ matches www.afcyhf.net, www.anrdoezrs.com, www.apmebf.net, www.awltovhc.net, www.commission-junction.net, www.dpbolvw.com, www.emjcd.net, www.ftjcfx.net, www.jdoqocy.net, www.kdukvh.net, www.kqzyfj.net, www.pkracv.net, www.tkqlhce.net, www.tqlkg.net which are not in targets www.afcyhf.com, www.anrdoezrs.net, www.apmebf.com, www.awltovhc.com, www.awxibrm.com, cj.com, *.cj.com, www.commission-junction.com, www.cualbr.com, www.dpbolvw.net, www.emjcd.com, www.ftjcfx.com, www.jdoqocy.com, www.kdukvh.com, www.kqzyfj.com, www.pkracv.com, www.rnsfpw.net, www.qksrv.net, www.qksz.net, www.tkqlhce.com, www.tqlkg.com, www.vofzpwh.com, www.yceml.net
  • Coupons-Inc.xml: ^http://coupouns\.com/ matches coupouns.com which are not in targets coupons.com, *.coupons.com, cdn.cpnscdn.com
  • DealerRater.com.xml: ^http://cdn-static\.dealerrater\.com/ matches cdn-static.dealerrater.com which are not in targets dealerrater.com, www.dealerrater.com
  • E-junkie.xml: ^http://(www\.)?ejunkie\.com/(bb/images/|ecom/|ej//?(?:(?:(?:admin|contact|register)\.php|shop)(?:$|\?|/)|css/|images/|media/)|gc/) matches ejunkie.com, www.ejunkie.com which are not in targets e-junkie.com, *.e-junkie.com
  • E-rewards.com.xml: ^http://(?:www\.)?e-rewards\.com/ matches e-rewards.com which are not in targets www.e-rewards.com
  • EPEAT.xml: ^http://ww2\.epeat\.com/ matches ww2.epeat.com which are not in targets ww2.epeat.net
  • Exhale.xml: ^http://(?:www\.)?4exhale\.org/ matches 4exhale.org, www.4exhale.org which are not in targets exhaleprovoice.org, www.exhaleprovoice.org
  • FORA.tv.xml: ^http://(?:cdn\.|(www\.))?fora\.tv/(fora/clientscript|purchase)/ matches cdn.fora.tv which are not in targets fora.tv, www.fora.tv
  • Factory_Expo_Home_Centers.xml: ^http://(www\.)?(azchampion|expo(?:mobile)?homes|factory(?:directcabins|expo(?:direct|expohomes|mobilehomes)|homesale|selecthomes)|fbhexpo)\.com/ matches factoryexpoexpohomes.com, factoryexpomobilehomes.com, www.factoryexpoexpohomes.com, www.factoryexpomobilehomes.com which are not in targets azchampion.com, *.azchampion.com, cimacorp.net, www.cimacorp.net, expohomes.com, *.expohomes.com, expomobilehomes.com, *.expomobilehomes.com, factorydirectcabins.com, *.factorydirectcabins.com, factoryexpo.net, *.factoryexpo.net, factoryexpodirect.com, *.factoryexpodirect.com, factoryexpohomes.com, *.factoryexpohomes.com, factoryhomesale.com, www.factoryhomesale.com, factoryselecthomes.com, *.factoryselecthomes.com, fbhexpo.com, *.fbhexpo.com
  • Globat.xml: ^http://(secure\.|www\.)?globat\.com(?:443)?/ matches globat.com443, secure.globat.com443, www.globat.com443 which are not in targets globat.com, *.globat.com
  • Halifax.xml: ^http://(?:www\.)?halifax(-online)?\.co\.uk/ matches halifax-online.co.uk which are not in targets halifax.co.uk, *.halifax.co.uk, *.halifax-online.co.uk
  • Herdict.xml: ^http://(?:www\.)?(herdic|nardik)t\.org/ matches nardikt.org, www.nardikt.org which are not in targets herdict.org, *.herdict.org, nardikt.ru, *.nardikt.ru
  • Hi.nl.xml: ^http://(?:www\.)?hi\.nl/ matches hi.nl which are not in targets *.hi.nl
  • HurricaneElectric.xml: ^http://(?:www\.)?h(?:e\.com|urricaneelectric\.net)/ matches www.hurricaneelectric.net which are not in targets he.com, www.he.com, he.net, *.he.net, hurricaneelectric.net
  • ICMail.xml: ^http://(?:www\.)?icmail\.net/ matches icmail.net, www.icmail.net which are not in targets icmail.com, www.icmail.com
  • IndiaMART.xml: ^http://1\.imgimg\.com/ matches 1.imgimg.com which are not in targets *.imimg.com, im.gifbt.com
  • Indiana_State_University.xml: ^http://((?:isuportal|www1?)\.)?indstate\.edu/ matches indstate.edu which are not in targets *.indstate.edu
  • Indymedia.xml: ^http://www\.belgium\.indymedia\.org/ matches www.belgium.indymedia.org which are not in targets indymedia.org, athens.indymedia.org, belgium.indymedia.org, brussels.indymedia.org, bruxelles.indymedia.org, bxl.indymedia.org, de.indymedia.org, grenoble.indymedia.org, *.grenoble.indymedia.org, lille.indymedia.org, linksunten.indymedia.org, www.linksunten.indymedia.org, nantes.indymedia.org, www.nantes.indymedia.org, nyc.indymedia.org, publish.nyc.indymedia.org, www.nyc.indymedia.org, pouget.indymedia.org, publish.indymedia.org, www.indymedia.org, netherlands.indymedia.org
  • Itpol.dk.xml: ^http://(www\.)?it-?pol\.dk/ matches itpol.dk which are not in targets it-pol.dk, www.it-pol.dk
  • Kuruc.info.xml: ^http://(m\.)?(?:kuruc\.info|w\.kuruc\.org)/ matches m.w.kuruc.org which are not in targets kuruc.info, m.kuruc.info, w.kuruc.org
  • MAPS.xml: ^http://store\.maps\.org/ matches store.maps.org which are not in targets maps.org, www.maps.org
  • MVG-mobil.de.xml: ^http://(?:www\.)?mvg-mobile\.de/ matches mvg-mobile.de, www.mvg-mobile.de which are not in targets mvg-mobil.de, www.mvg-mobil.de
  • MediaMind.xml: ^http://(?:www\.)?eyeblasterwiz\.com/ matches eyeblasterwiz.com, www.eyeblasterwiz.com which are not in targets *.mediamind.com, a.pgtb.me
  • My_Aloe_Cleanse.com.xml: ^http://(www\.)?(myaloe|trypto)cleanse\.com/ matches tryptocleanse.com, www.tryptocleanse.com which are not in targets myaloecleanse.com, *.myaloecleanse.com, tryprocleanse.com, *.tryprocleanse.com
  • Mydrive.xml: ^http://(www|static|webdav)\.?mydrive\.ch/ matches wwwmydrive.ch which are not in targets mydrive.ch, www.mydrive.ch, static.mydrive.ch, webdav.mydrive.ch
  • Myftp.utechsoft.com.xml: ^http://(?:www\.)?myftp\.utechsoft\.com/ matches www.myftp.utechsoft.com which are not in targets myftp.utechsoft.com
  • NPO.nl.xml: ^http://(help|mijn|assets\.www)\.npo\.nl/ matches help.npo.nl, mijn.npo.nl which are not in targets assets.www.npo.nl
  • Neobookings.xml: ^http://((?:admin|secure|webservices|www)\.)?neobookings\.com/ matches admin.neobookings.com, secure.neobookings.com, webservices.neobookings.com, www.neobookings.com which are not in targets neobookings.com
  • Netline.com.xml: ^http://my\.netline\.com/ matches my.netline.com which are not in targets ox-d.netline.com
  • Nexaway.xml: ^http://(www\.)?nexaway\.com/ matches nexaway.com, www.nexaway.com which are not in targets nexway.com, *.nexway.com
  • OHeffernan.xml: ^http://(?:www\.)?inchinashop\.com/ matches inchinashop.com, www.inchinashop.com which are not in targets inachinashop.com, www.inachinashop.com
  • OU.edu.xml: ^http://(?:www\.)?ou\.edu/ matches ou.edu which are not in targets *.ou.edu
  • Online.nl.xml: ^http://(?:www\.)?online\.nl/ matches online.nl which are not in targets *.online.nl
  • Orion_Magazine.org.xml: ^http://(www\.)?orionmagazine\.com/ matches orionmagazine.com, www.orionmagazine.com which are not in targets orionmagazine.org, www.orionmagazine.org
  • PAL_cdn.com.xml: ^http://(image|cs)s\.palcdn\.com/ matches css.palcdn.com which are not in targets images.palcdn.com
  • PBwiki.xml: ^http://(files|my|(pb-api)?docs|secure|vs1|www)\.pbworks\.com/ matches files.pbworks.com, my.pbworks.com, docs.pbworks.com, pb-apidocs.pbworks.com, secure.pbworks.com, vs1.pbworks.com, www.pbworks.com which are not in targets *.pbwiki.com
  • PamConsult.xml: ^http://(www\.)?(pamconsult|pamela|pamfax)\.(biz|com)/ matches pamconsult.biz, pamela.com, pamfax.com, www.pamconsult.biz, www.pamela.com, www.pamfax.com which are not in targets pamconsult.com, www.pamconsult.com, pamela.biz, www.pamela.biz, pamfax.biz, s.pamfax.biz, www.pamfax.biz
  • Parse.ly.xml: ^http://(?:www\.)?parse\.ly/ matches parse.ly, www.parse.ly which are not in targets parsely.com, *.parsely.com
  • Peer1.ca.xml: ^http://(www\.)?partners\.peer1\.ca/ matches partners.peer1.ca, www.partners.peer1.ca which are not in targets *.peer1.com
  • Record-Store-Day.xml: ^http://(?:www\.)?recordstoreday(?:\.tuneportals)\.com/ matches www.recordstoreday.tuneportals.com which are not in targets recordstoreday.com, www.recordstoreday.com, recordstoreday.tuneportals.com
  • RedIRIS.es.xml: ^http://(?:www\.)?rediris\.(es|net)/ matches rediris.net which are not in targets pki.irisgrid.es, rediris.es, *.rediris.es, www.rediris.net
  • Red_Ferret.xml: ^http://(www\.)?theferret\.net/ matches theferret.net, www.theferret.net which are not in targets redferret.net, *.redferret.net
  • Sears.com.xml: ^http://(?:www\.)?sears\.com/(?=favicon\.ico|shc/s/UserLogonFormView(?:$|[?/])) matches sears.com which are not in targets *.sears.com
  • SemiAccurate.xml: ^http://(www\.)?s(?:emiaccurate\.com|tonearch\.net)/ matches stonearch.net which are not in targets semiaccurate.com, *.semiaccurate.com, www.stonearch.net
  • Sinica.edu.tw.xml: ^http://(?:www\.)?ascc\.sinica\.edu/ matches ascc.sinica.edu, www.ascc.sinica.edu which are not in targets *.sinica.edu.tw
  • Sirportly.xml: ^http://(app\.|www\.)?sirportly\.com/ matches app.sirportly.com which are not in targets sirportly.com, www.sirportly.com
  • Space2u.xml: ^http://(owa\.|webmail2\.|www\.)?space2u\.com/ matches owa.space2u.com, webmail2.space2u.com which are not in targets space2u.com, www.space2u.com
  • Svenskaspel.se.xml: ^http://www\.svenskaspel\.se/ matches www.svenskaspel.se which are not in targets svenskaspel.se
  • Texas_A_and_M_University.xml: ^http://(?:cllacdn|liberalarts)\.tamu\.edu(?:443)?/ matches cllacdn.tamu.edu443, liberalarts.tamu.edu443 which are not in targets tamu.edu, *.tamu.edu
  • The-Republic.xml: ^http://cdn\.therepublic\.com/ matches cdn.therepublic.com which are not in targets cdn1.therepublic.com
  • Traction-Digital.com.xml: ^http://cdn\.transaction-digital\.com/ matches cdn.transaction-digital.com which are not in targets *.traction-digital.com
  • Trusted_Reviews.xml: ^http://(?:secure\.|www\.)?trustedreviews\.com/ matches secure.trustedreviews.com which are not in targets trustedreviews.com, static.trustedreviews.com, www.trustedreviews.com
  • UNC.edu.xml: ^http://((?:ccinfo|ccpa|cmsp|connectcarolina(?:www\.)?cs|dir|directory|diversity|financeadmin|help|hr|infoporte|its|itsapps|its-commons|blogs\.lib|(?:www\.)?library|my|obi|(?:www\.)?onecard|onyen|(?:www\.)?pid|research|search|selfservice|(?:cs|events|global|help|its2|its-commons)\.sites|sso|diversity\.web|webservices|www)\.)?unc\.edu/ matches unc.edu which are not in targets *.unc.edu
  • Umea_University.xml: ^http://((?:www\.cambro|cas|intra\.ub|www)\.?)umu\.se/ matches www.cambroumu.se which are not in targets umu.se, *.umu.se
  • Uni.Lu.xml: ^http://(hpc|piwik)\.uni\.lu/ matches hpc.uni.lu, piwik.uni.lu which are not in targets *.uni.lui
  • University_of_Houston.xml: ^http://(?:www\.)?uh\.edu/ matches uh.edu which are not in targets *.uh.edu
  • Usenet.nl.xml: ^http://(?:www\.)?usenet\.nl/ matches usenet.nl which are not in targets *.usenet.nl
  • VBSEO.xml: ^http://(?:cdn\.|(www\.))?vbseo\.com/ matches cdn.vbseo.com which are not in targets vbseo.com, www.vbseo.com
  • VCE.com.xml: ^http://(static\.|www\.)?vce\.com/ matches static.vce.com which are not in targets vce.com, www.vce.com
  • VIPserv.org.xml: ^http://((?:webmail|www)\.)?vipserv\.org/ matches webmail.vipserv.org which are not in targets vipserv.org, www.vipserv.org, x14.eu
  • VirtualTourist.xml: ^http://(?:www\.)?virtualtourist\.com/ajax/ matches virtualtourist.com which are not in targets *.vtourist.com, *.virtualtourist.com
  • WebMD.xml: ^http://ls\.turn\.com/ matches ls.turn.com which are not in targets *.webmd.com
  • Whisper-Gifts.xml: ^http://www\.whispergifts/(media/|whisper/(?:login|manage|signup)) matches www.whispergifts which are not in targets whispergifts.com, www.whispergifts.com
  • YouVersion.xml: ^http?://(?:www\.)?(bible|yourversion)\.com/ matches htt://bible.com/*, htt://yourversion.com/*, htt://www.bible.com/*, htt://www.yourversion.com/* which don't look like URLs
  • YouVersion.xml: ^http?://(?:www\.)?(bible|yourversion)\.com/ matches www.yourversion.com which are not in targets bible.com, *.bible.com, *.biblesociety.co.za, youversion.com, *.youversion.com
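
For readers unfamiliar with the check behind this list, here is a minimal sketch of the idea in Python. It is not the script from #12294. The helper name `host_in_targets` is hypothetical, and it assumes that a leading `*.` in a target covers subdomains but not the bare domain (which is what the ARM.xml entry above implies). Other wildcard forms are not handled.

```python
import re


def host_in_targets(host, targets):
    """Return True if `host` is covered by any target pattern.

    Assumption: "*.example.com" covers subdomains of example.com
    (at any depth) but not example.com itself.
    """
    for target in targets:
        if target.startswith("*."):
            suffix = target[1:]  # e.g. ".arm.com"
            if host.endswith(suffix) and len(host) > len(suffix):
                return True
        elif host == target:
            return True
    return False


# Data taken from the ARM.xml entry in the list above.
rule = re.compile(r"^http://(?:www\.)?arm\.com/(?=css/|images/)")
targets = ["*.arm.com"]
candidates = ["arm.com", "www.arm.com"]

# A host is a problem if the rule matches a URL on it but no target covers it.
uncovered = [
    host
    for host in candidates
    if rule.search(f"http://{host}/css/") and not host_in_targets(host, targets)
]
print(uncovered)
```

Here the rule matches both candidate hosts, but only `www.arm.com` is covered by `*.arm.com`, so `arm.com` is the one reported.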
@RReverser
Contributor Author

Updated the list following the rerun in #12310 and turned the items into checkboxes for convenience (to mark when fixed). @Hainish @Bisaloo

@RReverser
Contributor Author

Re-ran the script and updated the list of failures again.

@RReverser
Contributor Author

Updated to exclude rulesets that have default_off set for now.

@jayvdb
Contributor

jayvdb commented Feb 11, 2020

This could be 'fixed' by having the rule checker extract probable hostnames from the regexes, verify them by matching them against the regex, and then verify that they are in the targets. It wouldn't catch all of them, but it would get most of them, and it would help prevent new ones from occurring and reduce PR reviewer effort.
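
The three steps described above could be sketched roughly as follows. This is an illustration only, not the rule checker's actual code. All function names are hypothetical, the expander handles only a small regex subset (literals, escaped punctuation, groups, `|`, `?`) and gives up on anything else, and the `*.` wildcard handling is an assumption.

```python
import re


def expand(pattern):
    """Enumerate the strings matched by a tiny regex subset, else ValueError."""
    pos = 0

    def parse_alt():
        nonlocal pos
        results = set(parse_seq())
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            results |= parse_seq()
        return results

    def parse_seq():
        nonlocal pos
        results = {""}
        while pos < len(pattern) and pattern[pos] not in "|)":
            atom = parse_atom()
            if pos < len(pattern) and pattern[pos] == "?":
                pos += 1
                atom = atom | {""}  # optional atom
            results = {a + b for a in results for b in atom}
        return results

    def parse_atom():
        nonlocal pos
        ch = pattern[pos]
        if ch == "(":
            pos += 1
            if pattern.startswith("?:", pos):
                pos += 2
            elif pattern.startswith("?", pos):
                raise ValueError("lookarounds and flags are not supported")
            inner = parse_alt()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("unbalanced group")
            pos += 1
            return inner
        if ch == "\\":
            if pos + 1 >= len(pattern) or pattern[pos + 1].isalnum():
                raise ValueError("escape classes are not supported")
            pos += 2
            return {pattern[pos - 1]}
        if ch in "[]{}*+.^$?":
            raise ValueError(f"unsupported construct: {ch}")
        pos += 1
        return {ch}

    results = parse_alt()
    if pos != len(pattern):
        raise ValueError("unbalanced group")
    return results


def probable_hosts(rule_from):
    """Step 1: extract candidate hostnames from the host part of the regex."""
    prefix = "^http://"
    if not rule_from.startswith(prefix):
        return set()
    body, depth, i = rule_from[len(prefix):], 0, 0
    while i < len(body):  # find the first "/" outside any group
        if body[i] == "\\":
            i += 2
            continue
        if body[i] == "(":
            depth += 1
        elif body[i] == ")":
            depth -= 1
        elif body[i] == "/" and depth == 0:
            break
        i += 1
    else:
        return set()
    try:
        return expand(body[:i])
    except ValueError:
        return set()


def covered(host, targets):
    """Assumption: "*.x" covers subdomains of x but not x itself."""
    return any(
        host.endswith(t[1:]) and len(host) > len(t) - 1 if t.startswith("*.") else host == t
        for t in targets
    )


def hosts_missing_from_targets(rule_from, targets):
    """Steps 2 and 3: keep hosts the regex really matches that no target covers."""
    return [
        host
        for host in sorted(probable_hosts(rule_from))
        if re.match(rule_from, f"http://{host}/") and not covered(host, targets)
    ]


# Data taken from the Axis_Bank.xml entry in the list above.
print(hosts_missing_from_targets(
    r"^http://(?:www\.)?axisbank\.co(\.in|m)/",
    ["www.axisbank.co.in", "axisbank.com", "*.axisbank.com"],
))
```

As the comment anticipates, a sketch like this misses cases: rules with character classes are skipped entirely, and rules whose path part carries a lookahead fail the verification step because only the bare `http://host/` URL is tried.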

jayvdb added a commit to jayvdb/https-everywhere that referenced this issue Feb 13, 2020
Subset of hosts in regex that are not in targets.
jayvdb added a commit to jayvdb/https-everywhere that referenced this issue Feb 13, 2020
Subset of hosts in regex that are not in targets.
jayvdb added a commit to jayvdb/https-everywhere that referenced this issue Feb 18, 2020
Subset of hosts in regex that are not in targets.
jayvdb added a commit to jayvdb/https-everywhere that referenced this issue Feb 18, 2020
Subset of hosts in regex that are not in targets.
jayvdb added a commit to jayvdb/https-everywhere that referenced this issue Feb 18, 2020
Rule was for .com, which wasn't listed as a target.

.com is not functional; redirect to .org which is functional.

Related to EFForg#12297
jayvdb added a commit to jayvdb/https-everywhere that referenced this issue Feb 18, 2020
Subset of hosts in regex that are not in targets.
J0WI added a commit that referenced this issue Feb 18, 2020
* Fix rule hostname typos (#12297)

Subset of hosts in regex that are not in targets.

* Whisper-Gifts.xml: Fix domain name in rules

whispergifts.com has invalid cert;
www.whispergifts.com is usable.
jayvdb added a commit to jayvdb/https-everywhere that referenced this issue Feb 19, 2020
Rule was for .com, which wasn't listed as a target.

.com is a parked domain; .org is functional.

Related to EFForg#12297
@zoracon zoracon closed this as completed Feb 1, 2022