Notepad++ XML - Conditionally delete tags whose child tag content does not contain specific content

Aug 19 2020

I have a large XML file from which I need to remove some child elements: those in which the content of a nested child element does not start with the right prefix.

My XML file looks like this:

<product>
    <catalogEntry>
      <idPath><![CDATA[K212/G425638/G425649/G426239/G426265/G601769]]></idPath>
      <namePath><![CDATA[Web Katalog DK/Solar Plus/Solar Plus EL/Afsnit 12 - Kommunikations- & sikringsmateriel/Racks/Vægracks]]></namePath>
      <ImagePath><![CDATA[K212-{\pics\_catalogmanager\sz2\ikon_solarplus.jpg}{\pics\_catmandk_kampagner\sz2\ikon solar plus_el.jpg}{\pics\_catmandk_solar plus\sz2\solarplusel_afs.13.jpg}{\pics\cubic cabinet\sz2\5709832021591p.jpg}{\pics\mass creation\sz2\0000101760-6he2060020med20plade20a.jpg}]]></ImagePath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K352/G600248/G600247]]></idPath>
      <namePath><![CDATA[Solar plus mini guide/Rack og tilbehør/Vægrack]]></namePath>
      <ImagePath><![CDATA[K352-{}{}]]></ImagePath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K212/G425642/G444580/G444590/G444598]]></idPath>
      <namePath><![CDATA[Web Katalog DK/Kommunikation/Rack, tilbehør, kabel management/Vægrack/Solar Plus Vægrack]]></namePath>
      <ImagePath><![CDATA[K212-{\pics\_catalogmanager\sz2\ikon_kommunikation.jpg}{\pics\_catalogmanager\sz2\kommunikation_rack-skabe_.jpg}{\pics\lk dataconnect\sz2\5703302138918p.jpg}{\pics\mass creation\sz2\0000101760-6he2060020med20plade20a.jpg}]]></ImagePath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K193/G389888/G395066/G585958/G586999/G600567]]></idPath>
      <namePath><![CDATA[PRODUCTS NOT VISIBLE IN WEB KATALOG DK/Grp7 - Kabel § Føringsveje § Data/157R - Rune Agersnap/Kampagnemails/Afsluttede kampagner/Nye Solar plus vægrack - Gældende til op med d. 05.05.19]]></namePath>
      <ImagePath><![CDATA[K193-{}{}{}{}{\pics\mass creation\sz2\0000101760-10he2050020med20plade20fri.jpg}]]></ImagePath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K212/G425639/G426577/G426699/G426927/G426940/G600572]]></idPath>
      <namePath><![CDATA[Web Katalog DK/EL/(10.00 - 29.99) Stærkstrømsmateriel/12.00 Kapslings- og tavlemateriel/12.30 Rack-skabe inkl. tilbehør/Vægrack/Solar plus vægracks]]></namePath>
      <ImagePath><![CDATA[K212-{\pics\_catalogmanager\sz2\ikon_el.jpg}{\pics\_catalogmanager\sz2\10.00_29.99.jpg}{\pics\_catalogmanager\sz2\12.00.jpg}{\pics\cubic cabinet\sz2\5709832045535p.jpg}{\pics\cubic cabinet\sz2\5709832045399p.jpg}{\pics\mass creation\sz2\0000101760-6he2060020med20plade20a.jpg}]]></ImagePath>
    </catalogEntry>

I need to keep only the elements that contain <![CDATA[K212; the other <catalogEntry> elements have to be deleted.

I have tried some variations of this expression in Find and Replace: <catalogEntry>(?:(?!</catalogEntry>.)+[^K212](?:(?!<catalogEntry>).)+</catalogEntry>\R

but I get an invalid expression error.
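As a side note on the attempt above, here is a minimal Python sketch of two likely problems with it. Python's re module is a different engine from the one in Notepad++, so the error text will differ, and the trailing \R is left out because Python's re does not support it. The helper name compile_error is made up for this illustration.

```python
import re

# The attempted expression from the question, minus the trailing \R
# (Python's re has no \R; that is not the error demonstrated here).
ATTEMPT = (
    r"<catalogEntry>(?:(?!</catalogEntry>.)+[^K212]"
    r"(?:(?!<catalogEntry>).)+</catalogEntry>"
)


def compile_error(pattern):
    """Return the regex compile error message, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


# Four groups are opened but only three are closed: the first (?: never
# gets its closing parenthesis, so the pattern cannot be compiled.
print(compile_error(ATTEMPT))

# [^K212] is a character class: ONE character that is not K, 2 or 1.
# It does not mean "not the string K212".
print(bool(re.fullmatch(r"[^K212]", "X")))  # True
print(bool(re.fullmatch(r"[^K212]", "K")))  # False
```

Excluding a whole string such as K212 needs a negative lookahead rather than a negated character class.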

Answers

Toto Aug 19 2020 at 15:07
  • Ctrl+H
  • Find what: <(catalogEntry)>(?:(?!\1)(?!\[K212).)+</\1>\R?
  • Replace with: LEAVE EMPTY
  • CHECK Match case
  • CHECK Wrap around
  • CHECK Regular expression
  • CHECK . matches newline
  • Replace all
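For anyone who wants to try the replacement outside Notepad++, the steps above can be sketched in Python. This is an approximation, not the Notepad++ engine: Python's re has no \R, so the optional line break is written as (?:\r?\n)?, and re.DOTALL plays the role of ". matches newline". The sample text and the function name drop_non_k212 are made up for the illustration.

```python
import re

# Same pattern as in the "Find what" field, except that \R? is replaced
# by (?:\r?\n)? because Python's re module does not support \R.
PATTERN = re.compile(
    r"<(catalogEntry)>(?:(?!\1)(?!\[K212).)+</\1>(?:\r?\n)?",
    re.DOTALL,  # counterpart of ". matches newline"
)

# Hypothetical, shortened sample in the shape of the question's file.
SAMPLE = """<product>
    <catalogEntry>
      <idPath><![CDATA[K212/G425638]]></idPath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K352/G600248]]></idPath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K193/G389888]]></idPath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K212/G425642]]></idPath>
    </catalogEntry>
</product>
"""


def drop_non_k212(xml_text: str) -> str:
    """Remove every catalogEntry block that does not contain '[K212'."""
    return PATTERN.sub("", xml_text)


if __name__ == "__main__":
    print(drop_non_k212(SAMPLE))
```

The match starts at <catalogEntry>, so the indentation in front of each removed block is left behind in the output.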

Explanation:

<(catalogEntry)>        # open tag, capture the tag name in group 1
                        # Tempered Greedy Token
(?:                     # non-capturing group
  (?!\1)                  # negative lookahead: "catalogEntry" must not follow here
  (?!\[K212)              # negative lookahead: "[K212" must not follow here
  .                       # any character
)+                      # end of group, must appear 1 or more times
</\1>                   # close tag
\R?                     # optional line break
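The breakdown above can also be checked by listing which entries the pattern selects. The following is a sketch in Python under a few assumptions: (?:\r?\n)? stands in for \R?, which Python's re lacks, and the sample text and the helper matched_id_paths are invented for this example.

```python
import re

# The pattern in verbose mode, mirroring the explanation line by line.
ENTRY_WITHOUT_K212 = re.compile(
    r"""
    <(catalogEntry)>     # open tag, tag name captured in group 1
    (?:                  # tempered greedy token:
      (?!\1)             #   stop where 'catalogEntry' would follow
      (?!\[K212)         #   stop where '[K212' would follow
      .                  #   otherwise consume one character
    )+                   # one or more times
    </\1>                # close tag
    (?:\r?\n)?           # optional line break (adaptation of \R?)
    """,
    re.VERBOSE | re.DOTALL,
)

# Hypothetical, shortened sample in the shape of the question's file.
SAMPLE = """<product>
    <catalogEntry>
      <idPath><![CDATA[K212/G425638]]></idPath>
      <ImagePath><![CDATA[K212-{}]]></ImagePath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K352/G600248/G600247]]></idPath>
      <ImagePath><![CDATA[K352-{}{}]]></ImagePath>
    </catalogEntry>
    <catalogEntry>
      <idPath><![CDATA[K193/G389888]]></idPath>
      <ImagePath><![CDATA[K193-{}]]></ImagePath>
    </catalogEntry>
</product>
"""


def matched_id_paths(xml_text):
    """Return the idPath text of each entry the pattern would delete."""
    ids = []
    for match in ENTRY_WITHOUT_K212.finditer(xml_text):
        cdata = re.search(
            r"<idPath><!\[CDATA\[(.*?)\]\]></idPath>", match.group(0)
        )
        ids.append(cdata.group(1) if cdata else None)
    return ids


if __name__ == "__main__":
    print(matched_id_paths(SAMPLE))
    # ['K352/G600248/G600247', 'K193/G389888']
```

Two things follow from how the lookaheads are written. The (?!\1) guard keeps a match from running past the entry's own closing tag, so each match covers exactly one entry. The (?!\[K212) guard applies at every position inside the entry, so an entry is kept whenever [K212 appears anywhere in it, not only at the start of idPath.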

Screenshot (before):

Screenshot (after):