<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd" xmlns="http://www.loc.gov/MARC21/slim">
 <record>
  <leader>00000cab a22000003a 4500</leader>
  <controlfield tag="001">UP-99796217608517590</controlfield>
  <controlfield tag="003">Buklod</controlfield>
  <controlfield tag="005">20231007234511.0</controlfield>
  <controlfield tag="006">m    |o  d |      </controlfield>
  <controlfield tag="007">ta</controlfield>
  <controlfield tag="008">080911s        xx     d | ||r |||||   ||</controlfield>
  <datafield tag="035" ind1=" " ind2=" ">
   <subfield code="a">(iLib)UPD-00051795339</subfield>
  </datafield>
  <datafield tag="040" ind1=" " ind2=" ">
   <subfield code="a">DENGII</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
   <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="100" ind1="0" ind2=" ">
   <subfield code="a">Grosz, Lutz</subfield>
  </datafield>
  <datafield tag="245" ind1="0" ind2="0">
   <subfield code="a">How to vectorize the algebraic multilevel iteration.</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
   <subfield code="a">Vol. 26, no. 2</subfield>
   <subfield code="b">pp. 293 - 309</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
   <subfield code="a">We consider the algebraic multilevel iteration (AMLI) for the solution of systems of linear equations as they arise from a finite-difference discretization on a rectangular grid. The key operation is the matrix-vector product, which can be executed efficiently on vector and parallel-vector computer architectures if the nonzero entries of the matrix are concentrated in a few diagonals. In order to maintain this structure for all matrices on all levels, coarsening in alternating directions is used. In some cases it is necessary to introduce additional dummy grid hyperplanes. The data movements in the restriction and prolongation are crucial, as they produce massive memory conflicts on vector architectures. By using a simple performance model, the best of the possible vectorization strategies is automatically selected at runtime. Examples show that on a Fujitsu VPP300 the presented implementation of AMLI reaches about 85% of the useful performance, and scalability with respect to computing time can be achieved.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Large linear systems.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Multigrid method.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Numerical software.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Parallel processing.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Preconditioned iterative solver.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
   <subfield code="a">Vector computer.</subfield>
  </datafield>
  <datafield tag="773" ind1="0" ind2=" ">
   <subfield code="t">ACM transactions on mathematical software.</subfield>
   <subfield code="g">26, 2 (2000).</subfield>
  </datafield>
  <datafield tag="905" ind1=" " ind2=" ">
   <subfield code="a">FO</subfield>
  </datafield>
  <datafield tag="852" ind1=" " ind2=" ">
   <subfield code="a">UPD</subfield>
   <subfield code="b">DENG-II</subfield>
  </datafield>
  <datafield tag="942" ind1=" " ind2=" ">
   <subfield code="a">Article</subfield>
  </datafield>
 </record>
</collection>
